Unassigned_Ambiguity problem in featureCounts

Hi, I’m new to processing raw RNA-seq data, but my current project requires me to learn it.

I’m using the 12 runs from the ArrayExpress accession E-MTAB-4929.
I trimmed the reads with Cutadapt. I also had to work out whether the reads were stranded, so I ran Infer Experiment, which indicates they are paired-end and unstranded.
For the alignment I used RNA STAR, with the Glycine max (soybean) reference genome and its annotation file in GFF3 format, both downloaded from NCBI Genome.

The alignment QC from MultiQC looks OK to me; here is a screenshot:

The problem comes when I run featureCounts.
My input files are the BAM files from the alignment and the GFF3 annotation file for the Glycine max genome.
For reference, I followed the tutorial "Reference-based RNA-Seq data analysis".

I ran QC on the summary file (one of the featureCounts output files) and obtained this:


I’m getting less than 50% of the total fragments mapped and I don’t know why :S
Can someone please help me?

Dear Nati2208,
The featureCounts results do not show mapping statistics; they show how the reads were counted against your annotation. In other words, 50% of your reads are reported as ambiguous because they are assigned to two or more features in your annotation file, e.g. two or more transcripts of the same gene, or two or more exons because they map to exon-intron boundaries.
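
To see how the fragments are split between the counting categories, the summary file can be turned into percentages. The following is only a minimal sketch: it assumes the usual tab-separated layout of the featureCounts summary (a "Status" header line, then one row per category), and the sample name and numbers are made up for illustration, not taken from this dataset.

```python
import csv
import io


def summary_percentages(summary_text):
    """Return {status: percent of all fragments} for the first sample column.

    Categories with a count of zero are left out to keep the output short.
    """
    rows = list(csv.reader(io.StringIO(summary_text), delimiter="\t"))
    # Skip the header row ("Status", "<BAM name>") and read the first sample column.
    counts = {row[0]: int(row[1]) for row in rows[1:] if len(row) >= 2}
    total = sum(counts.values())
    if total == 0:
        return {}
    return {status: 100.0 * n / total for status, n in counts.items() if n > 0}


# Made-up numbers, for illustration only (not the results discussed in this thread).
EXAMPLE_SUMMARY = (
    "Status\tsample1.bam\n"
    "Assigned\t450\n"
    "Unassigned_Unmapped\t0\n"
    "Unassigned_NoFeatures\t50\n"
    "Unassigned_Ambiguity\t500\n"
)

if __name__ == "__main__":
    for status, pct in summary_percentages(EXAMPLE_SUMMARY).items():
        print(f"{status}\t{pct:.1f}%")
```

A large Unassigned_Ambiguity share with few unmapped reads points at the counting step and the annotation, not at the alignment.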

I hope I could answer your questions.

Best wishes,
Florian

Thank you Flow,

I didn’t know, thank you for correcting me.

Is there something I can do to get more assigned counts?

Dear Nati2208,
In featureCounts, under the advanced options, you can choose how ambiguous reads are handled: set "Allow reads to map to multiple features" (i.e. -O) to Yes, together with the option "Largest overlap". Test it and see what happens.
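
Outside Galaxy, these settings roughly correspond to the featureCounts flags -O and --largestOverlap. The sketch below only builds the command line so the options are visible; it does not run anything. The file names are hypothetical, and the -t and -g values shown are the featureCounts defaults, which may need changing to match the attributes actually present in a GFF3 file.

```python
import shlex


def featurecounts_command(bam_files, annotation, output,
                          feature_type="exon", group_attribute="gene_id"):
    """Build a featureCounts argument list for paired-end, unstranded data."""
    cmd = [
        "featureCounts",
        "-a", annotation,        # annotation file (GTF or compatible GFF)
        "-o", output,            # output count table
        "-t", feature_type,      # feature type to count (featureCounts default)
        "-g", group_attribute,   # grouping attribute (default; check your GFF3)
        "-s", "0",               # 0 = unstranded
        "-p",                    # input is paired-end
        "-O",                    # allow reads to overlap multiple features
        "--largestOverlap",      # prefer the feature with the largest overlap
    ]
    cmd.extend(bam_files)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = featurecounts_command(
        ["sample1.bam", "sample2.bam"], "annotation.gff3", "counts.txt"
    )
    print(shlex.join(cmd))
```

If featureCounts is installed, the returned list could be passed to subprocess.run(cmd, check=True). Comparing the summary file before and after shows how much the ambiguous fraction changes.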

Best wishes,
Florian

Dear Flow,

I ran featureCounts again with those options and obtained these results:

Thank you so much for your advice!
