Custom reference genome annotation

AlexaDean · January 18, 2021, 8:14pm

Hello,

I have the complete genome sequence (FASTA format) from NCBI for my reference genome, but no access to a GTF/GFF3 file (it is a phage identified in our lab). I normalized the FASTA file using NormalizeFasta. My understanding is that this is sufficient for the bwa-mem step, but I am unsure how to create my own GTF/GFF3 file for the downstream counting step (I am planning to use featureCounts or HTSeq). Any advice would be appreciated.
Alexa

jennaj · January 18, 2021, 9:04pm

Hi @AlexaDean

Please see the GTN tutorials here:

and maybe here, too:

AlexaDean · January 18, 2021, 9:25pm

Hello,

Sorry, I think I was unclear. The genome in NCBI has been annotated (there is an associated GenBank file), but I downloaded just the FASTA file of the genome. For bwa-mem, I input the custom reference genome (the normalized FASTA file) and selected build index; that job is running now. Next I want to run featureCounts or HTSeq, which to my knowledge require a GTF/GFF3 file. Which file should I use for that step?

jennaj · January 18, 2021, 9:42pm

The annotation from NCBI can be loaded into Galaxy. There is usually a separate annotation file available but you can also extract one from a GenBank file (tool: Genbank to GFF3 converter).
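To illustrate what extracting an annotation from a GenBank file involves, here is a minimal Python sketch. It is not the implementation of the Galaxy converter tool: the function name `genbank_cds_to_gff3` and the example record are hypothetical, and it only handles CDS features with simple locations (`start..end` or `complement(start..end)`), skipping joined or partial locations.

```python
import re

# Simple locations only: "100..400" or "complement(100..400)".
LOCATION = re.compile(r"^(complement\()?(\d+)\.\.(\d+)\)?$")


def genbank_cds_to_gff3(genbank_text, seqid):
    """Return GFF3 lines for the simple CDS features of one GenBank record."""
    features = []
    current = None       # CDS feature whose qualifiers are being read
    in_features = False
    for line in genbank_text.splitlines():
        if line.startswith("FEATURES"):
            in_features = True
            continue
        if line.startswith("ORIGIN"):
            break        # the sequence follows; the feature table has ended
        if not in_features:
            continue
        stripped = line.strip()
        if line[:5] == "     " and len(line) > 5 and line[5] != " ":
            # Feature keys are indented by exactly five spaces.
            key, _, location = stripped.partition(" ")
            match = LOCATION.match(location.strip())
            current = None
            if key == "CDS" and match:
                current = {
                    "start": int(match.group(2)),
                    "end": int(match.group(3)),
                    "strand": "-" if match.group(1) else "+",
                    "locus_tag": None,
                }
                features.append(current)
        elif current is not None and stripped.startswith("/locus_tag="):
            current["locus_tag"] = stripped.split("=", 1)[1].strip('"')

    gff = ["##gff-version 3"]
    for number, feat in enumerate(features, start=1):
        feature_id = feat["locus_tag"] or f"cds{number}"
        # Phase "0" assumes every CDS starts at a codon boundary.
        gff.append("\t".join([
            seqid, "GenBank", "CDS",
            str(feat["start"]), str(feat["end"]),
            ".", feat["strand"], "0", f"ID={feature_id}",
        ]))
    return gff


# Invented record, for illustration only.
EXAMPLE = "\n".join([
    "LOCUS       PHAGE1                  5000 bp    DNA     linear   PHG",
    "FEATURES             Location/Qualifiers",
    "     source          1..5000",
    '                     /organism="Example phage"',
    "     CDS             100..400",
    '                     /locus_tag="EX_0001"',
    "     CDS             complement(600..900)",
    '                     /locus_tag="EX_0002"',
    "ORIGIN",
])

for gff_line in genbank_cds_to_gff3(EXAMPLE, "PHAGE1"):
    print(gff_line)
```

Note that the `seqid` passed in has to match the sequence name in the FASTA header exactly, otherwise the counting tools will not pair reads with features.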

FAQs that may help to organize/format/label the inputs to avoid conflicts: Galaxy Support - Galaxy Community Hub

AlexaDean · January 25, 2021, 7:23pm

Hello,

Just to follow up: I realize now that the GFF3 file does not work with featureCounts, so this does not solve my problem.

jennaj · February 14, 2025, 9:24pm

For anyone else reading: you can convert reference annotation between formats (for example, GFF3 to GTF) with the tool gffread. The FAQs above have the full details.
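To show why a GFF3-to-GTF conversion matters for counting, here is a minimal Python sketch of the difference in the ninth column. It is not how gffread works internally: the function name `gff3_to_gtf` is hypothetical, and relabeling each CDS as an `exon` with a `gene_id` taken from the GFF3 `ID` is an assumption that suits intron-free phage genes. It is chosen because featureCounts and htseq-count both default to counting `exon` features grouped by `gene_id`.

```python
def gff3_to_gtf(gff3_lines, feature_type="CDS"):
    """Rewrite GFF3 features of one type as GTF 'exon' lines."""
    gtf = []
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        # GFF3 attributes look like "ID=EX_0001;Name=abc".
        attrs = dict(
            pair.split("=", 1) for pair in cols[8].split(";") if "=" in pair
        )
        gene_id = attrs.get("ID")
        if gene_id is None:
            continue  # nothing to group the feature by
        cols[2] = "exon"
        cols[7] = "."  # no reading frame is recorded for the relabeled exon
        # GTF attributes look like: gene_id "EX_0001"; transcript_id "EX_0001";
        cols[8] = f'gene_id "{gene_id}"; transcript_id "{gene_id}";'
        gtf.append("\t".join(cols))
    return gtf


# Invented GFF3 input, for illustration only.
example_gff3 = [
    "##gff-version 3",
    "PHAGE1\tGenBank\tCDS\t100\t400\t.\t+\t0\tID=EX_0001",
]
for gtf_line in gff3_to_gtf(example_gff3):
    print(gtf_line)
```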

An example of preparing data from NCBI (both the genome and the annotation) is in this topic: Inquiry about lettuce genome - #2 by jennaj
