Error with HTseq RNAseq read count

Hi,

I am getting an error while running HTSeq. This is the command and the error:

htseq-count -q -f bam -s yes Ac1_mapped/ac1_mappedAligned.bam /global/home/users/catalinacastro/star/genome/genomic_v2.gtf > count.txt

Error occurred when processing GFF file (line 637338 of file /global/home/users/catalinacastro/star/genome/genomic_v2.gtf): not enough values to unpack (expected 9, got 1) [Exception type: ValueError, raised in __init__.py:221]

Thanks, Catalina

Welcome, @Catalina_Castro

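The error message is reporting a problem with the reference annotation. GTF is a tab-separated format with nine columns per line, and "expected 9, got 1" means that line 637338 of your GTF did not split into nine columns.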
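To see what is actually on that line, a small sketch like the one below can help. It is only an illustration: `show_line` is a hypothetical helper name, and it assumes the line number in the error message is 1-based. Printing the `repr()` makes tabs (`\t`), spaces, and stray characters visible.

```python
from itertools import islice


def show_line(path, line_number):
    """Return repr() of one 1-based line so tabs and odd characters are visible."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # islice skips ahead without loading the whole file into memory
        for line in islice(handle, line_number - 1, line_number):
            return repr(line)
    # The file has fewer lines than line_number
    return None
```

For example, `print(show_line("genomic_v2.gtf", 637338))` would show the line named in the error. If it returns `None`, the file is shorter than expected, which may point to an incomplete upload. It is also worth looking at the lines just before and after it.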

What to do:

  1. Double check that the GTF file was completely uploaded

  2. Review the content. Is the file actually in GTF format?

  3. Do the chromosome identifiers in the GTF match the chromosome identifiers in your BAM file?

    • all inputs need to be based on the same reference genome assembly
    • chr1, Chr1, and 1 can mean the same thing to a person, but not to a tool
  4. Does the GTF have any headers? If so, try removing them since some tools cannot parse around them.

    • Technically, strict GTF never has headers but data providers include them anyway for provenance information. Whenever you suspect those are causing problems with a tool, try removing them to see what happens.
  5. If you have a GFF3 reference annotation instead, convert it to GTF format with the tool gffread. The tool htseq-count does not understand GFF3.

  6. I know this seems tedious, but everyone has to do the same. Getting the reference data correct is very important: if it is wrong, a tool may not fail at all and may instead produce scientifically problematic results that are not easy to detect.

    • Tools are matching up identifiers between files, and comparing genomic coordinates. Making sure that all data is based on the same assembly, and labeled in a consistent way, ensures accurate results.
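Several of the checks above can be run in one pass over the file. This is a minimal sketch, not an official validator: `check_gtf` and `names_missing_from` are hypothetical helper names, lines starting with `#` are treated as headers, and any other line that does not split into exactly nine tab-separated fields is reported. The set of chromosome names from the BAM is something you supply yourself, for example copied from the BAM header.

```python
from collections import Counter


def check_gtf(path):
    """Scan a GTF-like file and summarise common causes of parser errors."""
    report = {"header_lines": [], "bad_lines": [], "seqnames": Counter()}
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\r\n")
            if text.startswith("#"):
                # Header or comment line; some tools cannot parse around these
                report["header_lines"].append(number)
                continue
            fields = text.split("\t")
            if len(fields) != 9:
                # A line with no tabs at all counts as a single field
                report["bad_lines"].append((number, len(fields)))
                continue
            # Column 1 is the chromosome/sequence identifier
            report["seqnames"][fields[0]] += 1
    return report


def names_missing_from(report, alignment_names):
    """List GTF chromosome identifiers that are absent from the given set."""
    return sorted(set(report["seqnames"]) - set(alignment_names))
```

A `bad_lines` entry such as `(637338, 1)` would correspond to the "expected 9, got 1" error. If `names_missing_from(report, {"chr1", "chr2"})` returns most of the GTF identifiers, the annotation and the alignments probably use different naming or different assemblies.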

FAQs

If you can’t find the problem, we can try to help more here.

What information should I include when reporting a problem?

Any persistent problems can be reported in a new question for community help. Be sure to provide enough context so others can review the situation exactly and quickly offer advice.

Consider sharing your History or posting content from the Job Information view, as described in Troubleshooting errors.