jennaj
October 8, 2025, 4:24pm
Hello again – I wanted to clarify about the reference annotation GTF file for you or anyone else reading this later on. The guide above explains how to locate annotation that is based on the same base assembly for human data.
The first section contains the information you will probably be interested in if you are using the server index for the hg38 assembly itself.
Human example → Homo Sapiens GRCh38 hg38
The version of hg38 hosted as a native built-in index at UseGalaxy servers was sourced from UCSC.
Other data provider sources of GRCh38/hg38 can be slightly different.
[..] (see the full guide for details about how assemblies can differ)
Choice 1: native reference genome, native database key, and user supplied reference annotation
UCSC hosts GTF reference annotation in their Downloads area from a few different gene annotation tracks based on their hg38 assembly. These would work with tools without extra manipulations (copy URL + Upload with auto detect defaults == ready to use). This would allow you to assign the hg38 database metadata key to any datasets based on the same assembly version (basepairs) and labeling (identifiers). This would also allow you to link out to more display applications like IGV, UCSC, and others “automatically”.
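If you want to confirm the "labeling (identifiers)" part before assigning the hg38 database key, one rough way is to compare the sequence names in column 1 of the GTF against the names in the genome. This is only a sketch in Python: the function names, the sample lines, and the small `genome` set are all made up for illustration, and in practice you would take the genome names from your own reference (for example, its FASTA title lines).

```python
def gtf_sequence_names(lines):
    """Collect the sequence identifiers found in column 1 of GTF lines."""
    names = set()
    for line in lines:
        # Skip blank lines and "#" comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def names_missing_from_genome(gtf_names, genome_names):
    """Return the GTF identifiers that the genome does not contain, sorted."""
    return sorted(set(gtf_names) - set(genome_names))


# Hypothetical sample data: one UCSC-style name ("chr1") and one that is not ("1").
sample = [
    "# example header line\n",
    'chr1\texample\tgene\t100\t200\t.\t+\t.\tgene_id "g1";\n',
    '1\texample\tgene\t100\t200\t.\t+\t.\tgene_id "g2";\n',
]
genome = {"chr1", "chr2", "chrM"}  # assumed subset of UCSC hg38 names

print(names_missing_from_genome(gtf_sequence_names(sample), genome))  # ['1']
```

An empty list means every annotation identifier has a match in the genome. Anything listed is a name that tools would fail to pair with the reference.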
Other data sources are possible … but might need minor labeling or format adjustments. GENCODE hosts annotations that are based on this same hg38 base assembly. For GTF files from these other sources, removing the # header lines before using the file with tools is a good idea.
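Removing the # header lines can be done with tools inside Galaxy, but if you prefer to prepare the file before uploading, a minimal Python sketch could look like the one below. The function name and the sample lines are hypothetical and only for illustration. It assumes an uncompressed GTF where header lines start with `#`.

```python
def strip_gtf_header(lines):
    """Yield only the lines that do not start with '#'."""
    for line in lines:
        if not line.startswith("#"):
            yield line


# Hypothetical sample: two header lines followed by one annotation line.
example = [
    "# description: example header\n",
    "# provider: example\n",
    'chr1\texample\tgene\t100\t200\t.\t+\t.\tgene_id "g1";\n',
]

cleaned = list(strip_gtf_header(example))
print(len(cleaned))  # 1
```

For a real file, the same function can be applied to an open file handle and the result written out with `writelines()`, so the whole file never has to be held in memory.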
We have lots of prior Q&A about this! reference-genome reference-annotation
Examples:
UCSC as the annotation source