FeatureCounts Persistent Fatal error: Exit code 255

Hi,

I am trying to use the featureCounts tool. I have read through the community support questions from people having similar issues, and I have tried to follow the advice I understood, but none of it has helped. The message says

"ERROR: failed to find the gene identifier attribute in the 9th column of the provided GTF file.
The specified gene identifier attribute is ‘gene_id’
An example of attributes included in your GTF annotation is ‘gene_id’."

In the field “GFF feature type filter” I have tried gene, exon and CDS. In the field “GFF gene identifier” I have tried “gene_id”, “gene_name” and “gene_source”, and none of those options has worked. I have checked my GTF file, and the 9th column does contain “gene_id”.

I am using the hg38 human genome assembly. I am pretty new to Galaxy as well, so I would appreciate explanations for “beginners”.

I have followed these GalaxyHelp links without success, so anything different will be appreciated: link1, link2, link3, link4

Thank you!
Adriana

Hi Adriana,
check this thread:

Is it similar to your situation?
If you work with hg38, consider using featureCounts with built-in gene model

You can also post a fragment of the gene annotation you use with featureCounts.

I get reliable results with GENCODE annotations.

Hope that helps.

Kind regards,
Igor

Hi Igor,

Thank you for your reply. My issue is not similar to the one shared in the thread, as the tool stops before being able to give results at all. See below the error shown:

I tried using the built-in gene option but it is not “available” to use, see below.

I figure this could be because I am using a collection? So if because of that I cannot use the built-in option, how can I sort out the issue? I have verified and my gtf file does show “gene_id” in the 9th column, so the GFF gene identifier should be correct, although the error claims otherwise.

Thank you again.

Best,
Adriana

Hi Adriana,
featureCounts has a little bug affecting built-in genomes

It should work with collections.

The error log points to an issue with the annotation file. Maybe paste one gene annotation in a reply and I’ll check it.

Alternatively, you can share the history with me and I can have a look. In History Options (small triangle icon at the top right corner of the history panel) select Share or Publish, in the middle window make the history accessible, copy the URL and paste it into a reply. Ideally, it should be a small history. You can copy the relevant datasets into a new history and share that.

Kind regards,
Igor

Hi Igor,

Sorry for the late reply, and thank you for helping out. I tried again with the built-in genome, but I do think it has a bug: it does not seem to recognize it. I somehow got lucky last week and it worked for one (which is why I delayed my reply), but if I try to run it again I face the same issue:

[screenshot]

I have made a new History that contains only the relevant datasets, hopefully you can have a look at it.

Once again, thank you for the assistance.

Best,

Adriana

Hi @Adriana_PGS

Thanks for sharing the history, very helpful and I could spot exactly what is going wrong.

You are mapping against the Human T2T assembly → CHM13_T2T_v2.0

There are a few reasons not to use that assembly for transcriptome analysis, some of which are covered in this guide → Reference genomes at public Galaxy servers: GRCh38/hg38 example

So …

Technical issue → only four genomes are supported with the built-in annotation. This is noted on the tool form and visible in your screenshot. Your data was not mapped against one of those, so the option is not available.

Scientific issue → Transcriptome analysis will be much more stable when using hg38. Try mapping against that version of the human genome instead. Review my guide, and try an internet search, to understand why.

Hope that helps!

Hi Jennifer,

Thank you for your reply. However, in the Bowtie2 mapping I am using the hg38 assembly (as well as in a different history not linked here), and in neither case am I able to select the hg38 built-in genome option. The same message as in the image I shared above appears: “! Please provide a value for this option. Select built-in genome. No options available”.

So I have that problem when mapping against hg38 with Bowtie2, as seen in the linked history, and with the mappings (Bowtie2, HISAT2, RNA STAR) in histories not shared in the link.

Also, as I described initially and explained to Igor above, even when using hg38 and HISAT2, the fatal error appears after running featureCounts with a GTF from my history, which is why I tried to use the built-in genome option.

So, when I use the GTF from my history against hg38, featureCounts shows a fatal error (described earlier), and when I try to use the built-in genome against hg38, the option is not available.

I appreciate your reply but unfortunately it does not solve my issue.

Best regards,

Adriana

Hi Adriana,
thank you for the shared history. Collection #9 (bowtie2) has PE reads mapped to hg38. I used featureCounts with built-in gene models on this collection with no issue. As Jennifer pointed out, other collections use ref genomes not compatible with built-in models.

The procedure:
1. find featureCounts in history panel, click on it
2. change Gene annotation file to featureCounts built-in
3. change type of input to collection and select collection #9
4. change Does the input have read pairs to Yes, paired end and count them as 1 single fragment
5. hit Run tool
History with completed featureCounts job

Kind regards,
Igor

Hi Igor,

Thank you. I tried it in the same order you describe and that seems to fix it, though it is strange that selecting the collection first and then the gene annotation file makes it “crash”. Anyway, thank you for your advice and for taking the time to help sort this out! It turns out it was easier than I thought. I still have the fatal error when trying to use my own annotation, but at least for now working with hg38 will not be a problem, as I can continue with the built-in one.

Also, thank you Jennifer for your input; it is good to know about the compatibility with built-in models. I suppose this is the kind of thing we beginners overlook.

Best regards,

Adriana

Hi Adriana,
I am sorry, I forgot about the annotation file issue! Do you mean dataset #55 in the shared history? It does not look right. Click on the name of the dataset. The brief description says: 1 line, 2,905,058 comments, while the expected numbers should be something like 2,905,054 lines, 5 comments. It seems the file was poorly processed (during upload to Galaxy?), and all the content ended up in the 1st column as space-separated values. Basically, it is not a GTF file because of incorrect (missing) tab separators within the text. The number of columns is very high for GTF/GFF. I don’t remember if I have seen anything like this in Galaxy. I cannot think of an easy fix for it. I guess it is doable, but I would rather get a proper annotation.
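
One way to confirm this kind of problem outside Galaxy is a small script. The following is only a minimal sketch, assuming the annotation is available as plain text; the function name `summarize_gtf` and the example lines are made up for illustration. It counts comment lines, lines with exactly nine tab-separated columns that carry the requested attribute, and lines that fail either check. It splits the attribute column on `;`, so it may miscount attributes whose quoted values contain a semicolon.

```python
def summarize_gtf(lines, attribute="gene_id"):
    """Summarize the structure of GTF lines (hypothetical helper)."""
    summary = {
        "comments": 0,           # lines starting with '#'
        "well_formed": 0,        # 9 tab-separated columns and the attribute present
        "wrong_columns": 0,      # not exactly 9 tab-separated columns
        "missing_attribute": 0,  # 9 columns, but the attribute is absent
    }
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("#"):
            summary["comments"] += 1
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            summary["wrong_columns"] += 1
            continue
        # The 9th column looks like: gene_id "X"; gene_name "Y";
        keys = {
            part.strip().split(" ", 1)[0]
            for part in fields[8].split(";")
            if part.strip()
        }
        if attribute in keys:
            summary["well_formed"] += 1
        else:
            summary["missing_attribute"] += 1
    return summary


# Made-up example lines: one proper GTF record and the same record
# with its tabs replaced by spaces.
good = 'chr1\tsource\tgene\t100\t200\t.\t+\t.\tgene_id "GENE1"; gene_name "NAME1";'
broken = good.replace("\t", " ")
print(summarize_gtf(["# header comment", good, broken]))
```

For a real file, the same function can be called on an open file handle, for example `summarize_gtf(open("annotation.gtf"))`, where `annotation.gtf` is a placeholder name. A file whose records all land in `wrong_columns` has lost its tab separators.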

Maybe try a GENCODE annotation. Make sure you use an annotation with chromosome names chr1, chr2, etc., not 1, 2, etc.
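
The chromosome naming can also be checked with a few lines of code. This is a rough sketch under the same assumptions as before (plain-text, tab-separated GTF); the helper names `sequence_names` and `uses_chr_prefix` are hypothetical. It collects the names found in the first column and reports whether all of them start with `chr`.

```python
from collections import Counter


def sequence_names(lines):
    """Count the sequence names in the first GTF column (hypothetical helper)."""
    names = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names[line.split("\t", 1)[0]] += 1
    return names


def uses_chr_prefix(names):
    """Return True only if there is at least one name and all start with 'chr'."""
    return bool(names) and all(name.startswith("chr") for name in names)


# Made-up example records using the two naming styles.
ucsc_style = ['chr1\tsource\tgene\t100\t200\t.\t+\t.\tgene_id "GENE1";']
plain_style = ['1\tsource\tgene\t100\t200\t.\t+\t.\tgene_id "GENE1";']
print(uses_chr_prefix(sequence_names(ucsc_style)))
print(uses_chr_prefix(sequence_names(plain_style)))
```

The names reported here should match the chromosome names of the genome the reads were mapped against.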

How did you upload the annotation file? If possible, try uploading by URL, and avoid editing the file on a local machine if it runs Windows.

Kind regards,
Igor