How can I extract gene names from a custom GTF file?

Hello,

I am currently working on analyzing my RNA-seq data using the Reference-based RNA-seq method as outlined in this tutorial: Link.

To summarize, I have created a reference genome in .fasta format and an annotation file in .gtf format.

During the annotation of my DESeq2 results using the “Annotate DESeq2/DEXSeq output tables” tool, I’ve observed that the result file for host genes (human) contains a ‘gene name’ column, while the result file for pathogen genes (virus) lacks this column (N/A). Notably, the original GTF file for the virus includes ‘gene name’ information in the attributes column.

I am seeking assistance in identifying the underlying issue causing the absence of the ‘gene name’ column for pathogen genes in the DESeq2 result file.

Thank you for any insights or guidance you can provide.

Hi @Dongjoon

Have you adjusted the features that the tool is interpreting to better match your specific annotation?

Find these options by expanding the Advanced Options section.
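
Before changing those options, it can help to see which feature types and attribute keys your GTF actually contains. Here is a rough sketch, not part of the tool itself. It assumes a standard 9-column, tab-separated GTF with `key "value";` attributes. The function name `summarize_gtf` and the two example lines are made up for illustration.

```python
import re
from collections import Counter, defaultdict

# Matches GTF-style attribute pairs such as: gene_id "X"; gene_name "Y";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')

def summarize_gtf(lines):
    """Count feature types (column 3) and collect the attribute keys seen for each."""
    feature_counts = Counter()
    keys_by_feature = defaultdict(set)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF data line
        feature = fields[2]
        feature_counts[feature] += 1
        for key, _value in ATTR_RE.findall(fields[8]):
            keys_by_feature[feature].add(key)
    return feature_counts, dict(keys_by_feature)

# Hypothetical example lines; replace with lines read from your own GTF,
# e.g. summarize_gtf(open("virus.gtf")) where "virus.gtf" is a placeholder path.
example = [
    'virusA\tcustom\tgene\t1\t900\t.\t+\t.\tgene_id "VA_001"; gene_name "ORF1";\n',
    'virusA\tcustom\tCDS\t1\t900\t.\t+\t0\tgene_id "VA_001"; transcript_id "VA_001.t1";\n',
]
counts, keys = summarize_gtf(example)
print(counts)
print(keys)
```

If the attribute you want only shows up on some feature types, or the feature type the tool is set to read does not occur in your file at all, that would be one possible reason for the N/A values.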

If you need more help, please share these pieces of data and we can troubleshoot more:

  1. A few lines of a DESeq2 output
  2. A few lines of your GTF file (data lines, not headers). Lines that you’d expect to match the lines from the DESeq2 output would be the most helpful.
  3. Options you’ve set for this tool. The table of inputs/parameters from the job details page is what the tool is reading in, so that is the best way to share this part. Copy/paste. If you expand the datasets, then all three will be covered.
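
If you want to test this yourself while collecting those pieces, the lookup can be mimicked in a few lines. This is only a sketch of the general idea (match the ID from the DESeq2 table to an attribute on lines of one feature type), not the tool's actual code. The helper `gene_names_by_id`, the example lines, and the `exon` starting value are assumptions for illustration.

```python
import re

# Matches GTF-style attribute pairs such as: gene_id "X"; gene_name "Y";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')

def gene_names_by_id(gtf_lines, feature_type="exon",
                     id_key="gene_id", name_key="gene_name"):
    """Map id_key -> name_key using only GTF lines of the given feature type."""
    mapping = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != feature_type:
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        if id_key in attrs and name_key in attrs:
            mapping.setdefault(attrs[id_key], attrs[name_key])
    return mapping

# Hypothetical GTF lines and DESeq2 IDs; substitute a few of your own.
gtf = [
    'virusA\tcustom\tgene\t1\t900\t.\t+\t.\tgene_id "VA_001"; gene_name "ORF1";\n',
    'virusA\tcustom\tCDS\t1\t900\t.\t+\t0\tgene_id "VA_001";\n',
]
deseq2_ids = ["VA_001"]

for ft in ("exon", "CDS", "gene"):
    names = gene_names_by_id(gtf, feature_type=ft)
    print(ft, [names.get(g, "NA") for g in deseq2_ids])
```

With these made-up lines, only the `gene` rows return a name; `exon` and `CDS` return `NA` because no matching line carries both attributes. Seeing which combination works on your own data should indicate which settings to try under Advanced Options.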

Let’s start there, thanks!

Reference: FAQ: Extended Help for Differential Expression Analysis Tools