Feature count Fatal error: Exit code 255 ()

ERROR: failed to find the gene identifier attribute in the 9th column of the provided GTF file.
The specified gene identifier attribute is ‘exon’
An example of attributes included in your GTF annotation is ‘ID=exon-XR_003111846.3-1;Parent=rna-XR_003111846.3;Dbxref=GeneID:112587351,RFAM:RF00026,Genbank:XR_003111846.3;gbkey=ncRNA;gene=LOC112587351;inference=COORDINATES: profile:INFERNAL:1.1.1;product=U6 spliceosomal RNA;transcript_id=XR_003111846.3’
Can somebody explain the solution in an easy way? I am running RNA-seq data on domestic water buffalo. The genome and gene annotation file are from the same database at NCBI. With thanks in anticipation.

Hi @rkd

Your data seems to be in GFF3 format, not GTF. You can get the GTF from NCBI.
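If it helps to see the difference, here is a minimal sketch of how the two attribute styles can be told apart. It is only an illustration: the function name `guess_annotation_format` is made up for this example, and the check is a rough heuristic on the 9th column, not what Featurecounts itself does. GTF attributes look like `gene_id "X";` while GFF3 attributes look like `ID=X;Parent=Y`.

```python
import re

def guess_annotation_format(attributes: str) -> str:
    """Guess GTF vs GFF3 from a 9th-column attribute string (heuristic only)."""
    # GTF style: key "value"; for example: gene_id "LOC1"; transcript_id "XR_1";
    if re.search(r'\b\w+ "[^"]*"\s*;', attributes):
        return "GTF"
    # GFF3 style: key=value pairs separated by semicolons, for example: ID=exon-1;Parent=rna-1
    if re.search(r'(^|;)\s*\w+=[^;]*', attributes):
        return "GFF3"
    return "unknown"

# The attribute string from the error message above starts like this:
print(guess_annotation_format("ID=exon-XR_003111846.3-1;Parent=rna-XR_003111846.3"))
```

The string from your error message uses `key=value` pairs, which is why the tool could not find a GTF-style gene identifier attribute in it.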

Context links

Then you have a choice of reference files. I would choose these (if I guessed the species you are interested in correctly?):

You should be able to import those two files by URL into a Galaxy history using the Upload tool with all default settings. Then, run some data cleanup to get the format down to a very simple, basic specification. An example where I did this for another genome is here →

I did two things:

  1. Ran NormalizeFasta on the reference genome fasta to remove the description content from the > title lines. This isolates the chromosome identifiers in a way STAR and many other tools will expect.

  2. Ran Select to remove the # header lines from the reference annotation. Data providers include header lines for provenance reasons, but many (most?) tools expect a stricter format that does not include any headers. So, remove them to avoid errors. You can keep a copy of the original file for your records if you need to check or cite any of that header information later.
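For anyone who wants to see what those two cleanup steps amount to outside of Galaxy, here is a rough sketch. It is not the code behind NormalizeFasta or Select; the function names and the sample lines are made up for illustration, and it only mimics the two effects described above (trimming the `>` lines to the identifier, and dropping `#` lines).

```python
def strip_fasta_descriptions(lines):
    """Keep only the identifier (first whitespace-delimited token) on '>' lines."""
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            yield ">" + (parts[0] if parts else "") + "\n"
        else:
            # Sequence lines pass through unchanged.
            yield line

def drop_header_lines(lines):
    """Drop annotation lines that start with '#'."""
    for line in lines:
        if not line.startswith("#"):
            yield line

# Hypothetical input lines, for illustration only.
fasta = [">chr1 some long description text\n", "ACGT\n"]
gtf = ["#header line\n", 'chr1\tsrc\texon\t1\t10\t.\t+\t.\tgene_id "g1";\n']

print(list(strip_fasta_descriptions(fasta)))
print(len(list(drop_header_lines(gtf))))
```

The point of the first step is that the chromosome names in the fasta then match the first column of the annotation exactly, which is what STAR and Featurecounts compare.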

More details about what I am suggesting → FAQ: Extended Help for Differential Expression Analysis Tools

And, if you really want to use GFF3 data instead, that is possible, but has scientific considerations since different data points will be used for the summaries. If interested, this is one topic where that is explored. → Featurecounts error using a gene annotation from a gff3 file - #2 by jennaj.

Please give that a try. :slight_smile:

Thank you very much. Featurecounts worked. But one more query, please. I used the domestic water buffalo GTF for Featurecounts, but now that I am proceeding to annotate my IDs, there is no built-in option for buffalo under Organism, and I have to select Bos taurus, which is a different species from buffalo. Is this going to affect my analysis? If yes, please explain the solution in a simple way. With thanks in anticipation.

Hi @rkd

Glad you have Featurecounts working!

Then for AnnotateMyIDs, if your exact species assembly is not supported with a native index, then it will not work for you.

However, you could use this tool instead, using your DESeq2 output and annotation file →

  • Annotate DESeq2/DEXSeq output tables: Append annotation from GTF to differential expression tool outputs
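To give a feel for what "append annotation from GTF" means, here is a small sketch of the idea. It is not the tool's implementation. It assumes the first column of the DESeq2 table holds the same gene IDs as the `gene_id` attribute in the GTF, and the function names and sample rows are made up for illustration.

```python
import re

def gene_names_from_gtf(gtf_lines):
    """Map gene_id to a display name taken from GTF attribute strings."""
    names = {}
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        # Collect key "value" pairs from the 9th column.
        attrs = dict(re.findall(r'(\w+) "([^"]*)"', fields[8]))
        gene_id = attrs.get("gene_id")
        if gene_id:
            # Prefer gene_name, then gene, else fall back to the ID itself.
            name = attrs.get("gene_name") or attrs.get("gene") or gene_id
            names.setdefault(gene_id, name)
    return names

def annotate_table(de_rows, names):
    """Append a name column to rows whose first column is a gene ID."""
    return [row + [names.get(row[0], "NA")] for row in de_rows]

# Hypothetical data, for illustration only.
gtf = ['chr1\tsrc\texon\t1\t10\t.\t+\t.\tgene_id "g1"; gene_name "ABC";\n']
table = [["g1", "2.5"], ["g2", "-1.0"]]
print(annotate_table(table, gene_names_from_gtf(gtf)))
```

Because the names come from the same buffalo GTF you counted against, no other species is involved in the annotation.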

Please give that a try! :scientist: