Fewer than 50 percent of reads are assigned in featureCounts

mmomeni · June 1, 2021, 10:00am

Hi.
I am analyzing a human single-end, unstranded RNA-seq dataset. I used the built-in hg38 index in HISAT2 as the reference genome and the built-in hg38 GTF file in featureCounts as the annotation, but fewer than 40% of reads are assigned by featureCounts. I tried several different GTF files, but that did not help. What should I do to increase the percentage of assigned reads? Is anything wrong with this dataset? Should I avoid using it?

David · June 1, 2021, 10:13am

Welcome!
Which tools are you using?

mmomeni · June 1, 2021, 10:28am

Hi. HISAT2 for mapping and featureCounts for annotation assignment.

David · June 1, 2021, 11:27am

Sorry, I meant before HISAT2: quality checking and trimming?

mmomeni · June 1, 2021, 5:09pm

FastQC for quality checking and Cutadapt for trimming.

David · June 1, 2021, 6:00pm

Ok.
So, assuming you have high-quality reads, how do the HISAT2 alignment reports look? Can you show them here?

mmomeni · June 6, 2021, 1:37pm

Please accept my apologies for the delayed answer. MultiQC needs the HISAT2 summary files, so I ran the HISAT2 mapping again to obtain them. Here is the MultiQC report:

Sample Name	% Aligned
HISAT2 on data 447_ Mapping summary	90.2%
HISAT2 on data 449_ Mapping summary	86.9%
HISAT2 on data 451_ Mapping summary	88.0%
HISAT2 on data 453_ Mapping summary	86.7%
HISAT2 on data 455_ Mapping summary	86.2%
HISAT2 on data 457_ Mapping summary	85.9%
HISAT2 on data 459_ Mapping summary	87.2%
HISAT2 on data 461_ Mapping summary	88.0%
HISAT2 on data 463_ Mapping summary	86.6%
HISAT2 on data 465_ Mapping summary	85.0%
HISAT2 on data 467_ Mapping summary	86.8%
HISAT2 on data 469_ Mapping summary	86.0%
HISAT2 on data 471_ Mapping summary	80.2%
HISAT2 on data 473_ Mapping summary	81.6%
HISAT2 on data 475_ Mapping summary	88.8%
HISAT2 on data 477_ Mapping summary	89.9%
HISAT2 on data 479_ Mapping summary	86.8%
HISAT2 on data 481_ Mapping summary	85.5%
HISAT2 on data 483_ Mapping summary	87.3%
HISAT2 on data 485_ Mapping summary	83.0%
HISAT2 on data 487_ Mapping summary	88.0%
HISAT2 on data 489_ Mapping summary	87.1%
HISAT2 on data 491_ Mapping summary	84.8%
HISAT2 on data 493_ Mapping summary	85.8%
HISAT2 on data 495_ Mapping summary	83.1%
HISAT2 on data 497_ Mapping summary	87.9%
HISAT2 on data 499_ Mapping summary	89.1%
HISAT2 on data 501_ Mapping summary	86.7%
HISAT2 on data 503_ Mapping summary	88.7%
HISAT2 on data 505_ Mapping summary	88.8%
HISAT2 on data 507_ Mapping summary	86.9%
HISAT2 on data 509_ Mapping summary	85.1%
HISAT2 on data 511_ Mapping summary	85.7%
HISAT2 on data 513_ Mapping summary	87.7%
HISAT2 on data 515_ Mapping summary	88.2%
HISAT2 on data 517_ Mapping summary	87.3%
HISAT2 on data 519_ Mapping summary	85.5%
HISAT2 on data 521_ Mapping summary	87.0%
HISAT2 on data 523_ Mapping summary	86.1%
HISAT2 on data 525_ Mapping summary	88.8%
HISAT2 on data 527_ Mapping summary	88.9%
HISAT2 on data 529_ Mapping summary	84.6%
HISAT2 on data 531_ Mapping summary	85.1%
HISAT2 on data 533_ Mapping summary	83.8%
HISAT2 on data 535_ Mapping summary	88.9%
HISAT2 on data 537_ Mapping summary	88.4%
HISAT2 on data 539_ Mapping summary	85.9%
HISAT2 on data 541_ Mapping summary	88.0%
HISAT2 on data 543_ Mapping summary	88.9%
HISAT2 on data 545_ Mapping summary	88.7%
HISAT2 on data 547_ Mapping summary	86.3%
HISAT2 on data 549_ Mapping summary	87.2%
HISAT2 on data 551_ Mapping summary	86.3%
HISAT2 on data 553_ Mapping summary	85.6%
HISAT2 on data 555_ Mapping summary	87.2%
HISAT2 on data 557_ Mapping summary	90.4%

mmomeni · June 6, 2021, 5:01pm

And these are the FastQC and Cutadapt reports prepared by MultiQC.

David · June 6, 2021, 6:14pm

@mmomeni,
The mapping summary looks good, so I’d check whether your reads are mapped to regions that are not covered by your annotation.

Can you share which featureCounts parameters you are using? Maybe you can tweak some options related to stringency/overlap/quality, such as “Allow reads to map to multiple features”, “Minimum mapping quality per read”, “Minimum fraction (of read) overlapping a feature”, etc.
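
For readers who run featureCounts on the command line rather than in Galaxy, the options named above roughly correspond to the flags `-M` (count multi-mapping reads), `-O` (allow a read to overlap several features), `-Q` (minimum mapping quality) and `--fracOverlap` (minimum fraction of the read overlapping a feature). The sketch below only builds an argument list and runs nothing. The file names are placeholders, the helper name is hypothetical, and the correspondence between the Galaxy form labels and these flags is an assumption.

```python
def featurecounts_args(bam, gtf, out, count_multimappers=False,
                       allow_multi_overlap=False, min_mapq=0,
                       min_frac_overlap=0.0):
    """Build a featureCounts argument list (hypothetical helper; nothing is executed)."""
    # -a: annotation file, -o: output file, -s 0: unstranded library
    args = ["featureCounts", "-a", gtf, "-o", out, "-s", "0"]
    if count_multimappers:
        args.append("-M")          # also count multi-mapping reads
    if allow_multi_overlap:
        args.append("-O")          # allow reads overlapping more than one feature
    if min_mapq > 0:
        args += ["-Q", str(min_mapq)]                  # minimum mapping quality
    if min_frac_overlap > 0:
        args += ["--fracOverlap", str(min_frac_overlap)]  # minimum read fraction overlapping a feature
    args.append(bam)               # input alignment file goes last
    return args


# Placeholder file names, for illustration only.
print(" ".join(featurecounts_args("sample.bam", "hg38.gtf", "counts.txt",
                                  count_multimappers=True)))
```

Changing one option at a time and comparing the resulting summaries makes it easier to see which setting actually moves reads between categories.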

mmomeni · June 7, 2021, 9:29am

No, I did not change any of the options you mentioned. All of these parameters are at their default values.

mmomeni · June 10, 2021, 11:14am

Hello again. So what do you think about the problem? Can I use the results of this RNAseq dataset?

David · June 10, 2021, 3:18pm

Hello, @mmomeni. I’d still like to see your feedback on:

Have you read this?

And

So you’re not interested in trying to refine the parameters for (possibly) better results?

mmomeni · June 11, 2021, 6:24am

Yes, I have read that, but it did not help me because in that case the reads were unassigned due to NoFeatures and Ambiguity.
As you can see below, most of my unassigned reads are due to multi-mapping:

Considering this, which of the parameters you mentioned should be changed?
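
A breakdown like the one described above can also be computed directly from a featureCounts summary file, which is a tab-separated table with a `Status` header row followed by one row per category. The following is a minimal sketch assuming a single-sample summary. The function name is hypothetical and the numbers in the example are invented for illustration; they are not the values from this dataset.

```python
def summary_percentages(text):
    """Return the percentage of each non-empty category in a featureCounts summary."""
    counts = {}
    for line in text.strip().splitlines():
        status, value = line.split("\t")[:2]
        if status == "Status":      # header row: "Status<TAB>file name"
            continue
        counts[status] = int(value)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Keep only categories that actually contain reads.
    return {k: round(100 * v / total, 1) for k, v in counts.items() if v > 0}


# Invented example values, NOT taken from the thread.
example = (
    "Status\tsample.bam\n"
    "Assigned\t380\n"
    "Unassigned_MultiMapping\t450\n"
    "Unassigned_NoFeatures\t120\n"
    "Unassigned_Ambiguity\t50\n"
    "Unassigned_Unmapped\t0\n"
)
for status, pct in summary_percentages(example).items():
    print(f"{status}\t{pct}%")
```

Seeing which `Unassigned_*` category dominates indicates which option is worth changing first.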

David · June 11, 2021, 2:08pm

Now that’s a good report.

Have you tried this?

mmomeni · June 13, 2021, 12:12pm

I have now tried it, and the percentage of assigned reads improved a little (38-60 percent). But something strange happened: the percentage of Unassigned_NoFeatures increased a lot! What is the reason for that?
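
One plausible explanation, offered as an assumption and not a confirmed diagnosis of this dataset: featureCounts appears to apply the multi-mapping filter before it checks whether an alignment overlaps an annotated feature. While multi-mappers are excluded, all of them are reported as Unassigned_MultiMapping. Once they are allowed, each one is tested against the annotation, and those that overlap no feature are then reported as Unassigned_NoFeatures. The toy model below illustrates that ordering. It is not featureCounts' actual code, and the read records are invented.

```python
from collections import Counter


def classify(read, count_multimappers):
    """Toy classification of one read; the multi-mapping check comes first."""
    if read["multimapped"] and not count_multimappers:
        return "Unassigned_MultiMapping"
    if read["n_features"] == 0:
        return "Unassigned_NoFeatures"
    if read["n_features"] > 1:
        return "Unassigned_Ambiguity"
    return "Assigned"


def tally(reads, count_multimappers):
    return Counter(classify(r, count_multimappers) for r in reads)


# Invented reads: n_features is the number of annotated features overlapped.
reads = (
    [{"multimapped": False, "n_features": 1}] * 4    # unique, in a gene
    + [{"multimapped": False, "n_features": 0}] * 1  # unique, outside annotation
    + [{"multimapped": True, "n_features": 1}] * 2   # multi-mapped, in a gene
    + [{"multimapped": True, "n_features": 0}] * 3   # multi-mapped, outside annotation
)

print("multi-mappers excluded:", dict(tally(reads, count_multimappers=False)))
print("multi-mappers counted: ", dict(tally(reads, count_multimappers=True)))
```

In this toy input, counting multi-mappers raises Assigned from 4 to 6, but Unassigned_NoFeatures also rises from 1 to 4, because the multi-mapped reads that fall outside the annotation are now evaluated instead of being filtered earlier.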

David · June 13, 2021, 2:08pm

I guess I can’t give you any better help.
Have you compared all the options under “Allow reads to map to multiple features”?
