Need some guidance, analyzing RNA seq for the first time

tbilly93 · December 21, 2019, 6:34pm

Can someone explain how HISAT2 works? I’m currently processing an unstranded, paired-end data set that has about 20 million reads after trimming. After running HISAT2 with the mm10 reference genome, the output is about 10 million reads. Where did the other half go? What’s weird is that when I run htseq-count, the counts add up to 20 million.

Follow-up question: ideally, what should I be expecting? It seems like a lot of reads are falling under the no-feature category, which I’m assuming is not good.

Category	HISAT2 on data 2 and data 1: aligned reads (BAM)
__no_feature	8905861
__ambiguous	84369
__too_low_aQual	0
__not_aligned	5158843
__alignment_not_unique	6342605
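As a quick sanity check on the table above, the special counters can be summed and expressed as shares of their own total. This is only a minimal sketch: the numbers are copied from the post, the function name `summarize` is made up for illustration, and the reads that were assigned to genes are not shown in the post, so the percentages are relative to the special counters alone.

```python
# Special counters copied from the htseq-count output in the post.
special_counters = {
    "__no_feature": 8905861,
    "__ambiguous": 84369,
    "__too_low_aQual": 0,
    "__not_aligned": 5158843,
    "__alignment_not_unique": 6342605,
}


def summarize(counters):
    """Return the total and each counter's share of that total (in percent)."""
    total = sum(counters.values())
    shares = {name: 100.0 * value / total for name, value in counters.items()}
    return total, shares


total, shares = summarize(special_counters)
print(f"sum of special counters: {total:,}")
for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:<26}{pct:6.2f}%")
```

The sum comes to 20,491,678, so these five categories by themselves already account for the "about 20 million" figure. The `__not_aligned` and `__alignment_not_unique` rows together are more than half of that sum.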

Also, can anyone give me feedback on my pipeline and possible improvements?
I use Trimmomatic and FastQC,
then HISAT2 to output BAM,
then htseq-count for DESeq2.
Is there anything I should be cautious about, or some crucial step I’m leaving out?
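For readers who want to see the alignment and counting steps outside the Galaxy interface, here is a rough command-line equivalent, written as a Python helper that only builds the commands and does not run them. It is a sketch under assumptions: the file names, the `sample` prefix, and the `build_commands` helper are hypothetical, a HISAT2 index for mm10 is assumed to already exist, and trimming and QC are assumed to have been done beforehand. The `-s no` setting matches the unstranded library described above.

```python
import shlex


def build_commands(index_prefix, reads_1, reads_2, annotation_gtf, sample="sample"):
    """Build (but do not run) the alignment and counting commands for one sample."""
    sam = f"{sample}.sam"
    name_sorted_bam = f"{sample}.namesorted.bam"
    return [
        # Align trimmed paired-end reads; the summary file records alignment rates.
        ["hisat2", "-x", index_prefix, "-1", reads_1, "-2", reads_2,
         "-S", sam, "--summary-file", f"{sample}.hisat2_summary.txt"],
        # Sort by read name so that mates of a pair sit next to each other.
        ["samtools", "sort", "-n", "-o", name_sorted_bam, sam],
        # Count reads per gene; -s no = unstranded; counts are written to stdout.
        ["htseq-count", "-f", "bam", "-r", "name", "-s", "no",
         name_sorted_bam, annotation_gtf],
    ]


# Hypothetical inputs, for illustration only.
commands = build_commands("mm10", "trimmed_R1.fastq.gz", "trimmed_R2.fastq.gz",
                          "mm10_genes.gtf", sample="sample1")
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping the HISAT2 summary file for every sample makes it easier to see how many reads aligned uniquely, aligned more than once, or did not align before looking at the counts.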

Thanks for all the advice and help!

amir · December 24, 2019, 1:39pm

For the first part of your question, I suggest running FastQC on your HISAT2 results and then using MultiQC to observe what happened to your data after alignment.

The second part depends on your purpose and what you want to achieve from analyzing your data.
