Automatically acquiring and adding NCBI data

jennaj · July 18, 2019, 6:00pm

The position of genes/transcripts in a reference annotation dataset (GTF, GFF3) will be with respect to the reference genome.

iGenomes hosts several E. coli genomes/builds: iGenomes

Download the archive, unpack it, then upload the files you want to work with to Galaxy.

None of these are based on the UCSC version of the genome that is indexed at Galaxy Main https://usegalaxy.org, so load not just the annotation GTF data but the genome fasta as well. Do not use/assign the pre-indexed genomes to your data, or expect mismatch problems.

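Then use the genome fasta as a Custom Genome+Build with tools as needed. The fasta may need to be run through NormalizeFasta first. The genome and annotation need to be an exact match at the chromosome level: the chromosome identifiers in the annotation must be the same as the sequence identifiers in the fasta.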
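If you want to check the chromosome-level match outside of Galaxy before uploading, a minimal sketch along these lines may help. It is not a Galaxy tool; the function names are made up for illustration, and it assumes the sequence identifier is the first whitespace-delimited word on each ">" line and that the annotation is tab-delimited with the chromosome in the first column (as in GTF/GFF3).

```python
def fasta_names(lines):
    """Collect sequence identifiers from fasta '>' title lines.

    Assumption: the identifier is the first whitespace-delimited word.
    """
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def annotation_names(lines):
    """Collect chromosome names from column 1 of a GTF/GFF3 file."""
    names = set()
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0].strip())
    return names


def compare_names(fasta_lines, annotation_lines):
    """Return (names only in the annotation, names only in the fasta)."""
    genome = fasta_names(fasta_lines)
    annot = annotation_names(annotation_lines)
    return sorted(annot - genome), sorted(genome - annot)


# Tiny in-memory example with made-up records.
example_fasta = [">chr1 Escherichia coli", "ACGT", ">plasmid1", "GGCC"]
example_gtf = [
    "# made-up annotation",
    'chr1\tsrc\tgene\t1\t4\t.\t+\t.\tgene_id "a";',
    'NC_000913.3\tsrc\tgene\t1\t4\t.\t+\t.\tgene_id "b";',
]
missing_in_fasta, unused_in_fasta = compare_names(example_fasta, example_gtf)
print("In annotation but not in fasta:", missing_in_fasta)
print("In fasta but not in annotation:", unused_in_fasta)
```

For real files, pass open file handles instead of the lists, for example `compare_names(open("genome.fa"), open("genes.gtf"))` (both file names are placeholders). Any name reported as being in the annotation but not in the fasta is a likely source of mismatch problems.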

Details in these FAQs:

Galaxy tutorials for reference:

It is not clear what format your inputs are in or if you assembled the transcriptome yourself. Please explain more about your data (how created and/or source) and your larger goals if the above doesn’t help.

Thanks!
