My rule based uploader recipe for EBI ENA

pvanheus · September 2, 2020, 9:42am

The Rule based uploader tutorial by @jmchilton and @hexylena provides an example of creating a list of dataset pairs. I find a slightly adapted version of the procedure very useful for loading collections of sequences from EBI’s European Nucleotide Archive into Galaxy. Since one can now save and share “recipes” from the uploader, I thought I’d share my recipe here.

My starting point is a project page in ENA, e.g. the one for PRJNA522942, which is data from this paper by Chen et al. On that page, choose to download the report in TSV format (i.e. click the TSV link, which is highlighted in the image below):

In the rule based uploader, choose to upload data as a collection and to load the tabular data from a pasted table. Paste the TSV report from ENA into the form and click the Build button.

In the rule based uploader, click the spanner icon (highlighted in the image below), paste the recipe from here, give your new collection a name and click Upload.

Then sit back and wait. Galaxy will download the files from ENA and create a list of paired dataset collections with the name you gave it. Note that this recipe filters out datasets that are not paired-end, so it will not work with single-end data or with ENA projects that contain a mix of single-end and paired-end data.
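To make the paired-end filtering concrete, here is a rough Python sketch of the idea. It is not the recipe itself and not how Galaxy implements its rules. It assumes the ENA report has `run_accession` and `fastq_ftp` columns, that `fastq_ftp` holds the FASTQ paths separated by `;`, and that the first path is the forward read and the second the reverse read. The function name `paired_end_rows` is made up for illustration.

```python
import csv
import io


def paired_end_rows(tsv_text, url_column="fastq_ftp", id_column="run_accession"):
    """Yield (run id, direction, url) for runs that list exactly two FASTQ files.

    Column names are assumptions about the ENA report layout; adjust them
    to match the header of the TSV you actually downloaded.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Split the semicolon-separated list of paths, dropping empty entries.
        urls = [u for u in (row.get(url_column) or "").split(";") if u]
        if len(urls) != 2:
            # Not exactly two files: treat as not paired-end and skip it.
            continue
        # Assumption: the first path is the forward read, the second the reverse.
        for direction, url in zip(("forward", "reverse"), urls):
            if "://" not in url:
                # Add a scheme when the report gives a bare host/path.
                url = "ftp://" + url
            yield row[id_column], direction, url


# Made-up two-run report: RUN1 is paired-end, RUN2 is single-end.
sample = (
    "run_accession\tfastq_ftp\n"
    "RUN1\texample.org/RUN1_1.fastq.gz;example.org/RUN1_2.fastq.gz\n"
    "RUN2\texample.org/RUN2.fastq.gz\n"
)
for run_id, direction, url in paired_end_rows(sample):
    print(run_id, direction, url)
```

Only the two RUN1 files come out of the loop; RUN2 is dropped, which mirrors why a project containing only single-end runs would leave you with an empty collection.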

pvanheus · September 3, 2020, 8:16am

btw I have noticed that this procedure does not work for all ENA projects because of network problems on the EBI ENA side (variable network speeds and connection problems are common when doing automated retrieval from ENA). Perhaps some additional work on the Galaxy uploader could help fix this.

bjoern.gruening · September 8, 2020, 2:10pm

Super cool, thanks for sharing!
