DESeq2 with HISAT2 output?

LaiaGutierrez · April 25, 2022, 9:01am

Hi everybody

I am working with smallRNA. I followed this protocol (Differential abundance testing of small RNAs).

My aim is to do differential expression analysis of siRNA and piRNA using DESeq2. In this protocol they use DESeq2 but for abundance estimation, which is not my aim.

There is another protocol specific for miRNA (Whole transcriptome analysis of Arabidopsis thaliana) which uses DESeq2 for differential expression analysis. I want to do the same but with smallRNA.

Problem: DESeq2 requires a count table, but I don’t have it. In the miRNA protocol, they did miRDeep2 to obtain it. But in the smallRNA protocol, they use HISAT2 against rRNA and miRNA and keep the reads not aligned, after they do transformation from BAM to FASTQ. However, DESeq2 does not accept this file as input, as it is not a count table.

Any ideas how I could obtain the count table as input for DESeq2 after doing HISAT2 to obtain the non-rRNA/miRNA reads?

Thank you!

igor · April 29, 2022, 5:49am

Hi @LaiaGutierrez
assuming you mapped reads to miRNAs idxstats reports number of mapped reads in 3d column.
Hope this helps.
Kind regards,
Igor

LaiaGutierrez · April 29, 2022, 6:19am

Hi @igor ,

Thanks for your reply. The problem is, I after running HISAT2 against rRNA and miRNA databases and filtering those non-rRNA/miRNA with BAM tools, I don’t obtain a table, but a file like this:

But DESeq2 asks me for a counts table.

Any idea? Thanks!

igor · April 29, 2022, 6:33am

Hi @Letizia_Iuffrida
I am sorry for unclear explanation. I mis-read your question. Do you have list of ‘targets’ for piRNAs and siRNAs? If yes, map the filtered reads (unaligned to rRNA and miRNA) to genome and count against the annotated tagets. If the targets are are in GTF/GFF format, try featureCounts. For intervals bedtools might be a good option, but filter multimappers before counting.
Kind regards,
Igor

igor · April 29, 2022, 6:44am

If you deal with repeated sequences, map reads to unique set of sequences, in the same way as for miRNA. Some miRNA loci produce identical mature miRNA, hence we map to mature miRNAs to avoid ambiguity of multimapping.
Kind regards,
Igor

Letizia_Iuffrida · April 29, 2022, 6:59am

Sorry,
I think you tagged me by mistake instead of @LaiaGutierrez

igor · April 29, 2022, 7:05am

oops, sorry! It is Friday evening here. Time for a break.

LaiaGutierrez · April 29, 2022, 7:54am

Thanks @igor,
I don’t have targets for piRNA and siRNA. I just want to identify them. Can I align them to the human genome?

Thanks!

igor · April 29, 2022, 8:36am

Yes, short reads can be aligned to the human genome. If you map with HiSAT2 maybe disable spliced alignment in Alignment options or use genomic aligners such as bowtie(2) or BWA. Short reads might map to multiple positions, especially if derived from mobile elements or repeats. Usually multimapped reads are ignored during counting step. For counting table you need a file with positions of regions of interest.

LaiaGutierrez · April 29, 2022, 11:09am

So, to sum up:

HISAT2 to align against miRNA and rRNA reference databases.
BAM to keep those that did not align.
Feature counts using the human genome (.GFF)
DESeq2

Right?

igor · April 29, 2022, 10:48pm

For featureCounts you need an annotation.
Maybe something like this:
filter out reads derived from miRNA and rRNA (map using HiSAT2 to miRBase and rRNA sequences and write unmapped reads to FASTQ files.
map these reads to genome
Identify regions of interest.
Maybe get a published protocol for piRNA and siRNA and reproduce all the steps in Galaxy. These RNAs might require different approach in identification. I believe piRNAs are usually longer, so consider filtering by length.
Kind regards,
Igor

Topic		Replies	Views
HISAT2 against miRNA in smallRNA protocol usegalaxy.eu support	1	630	April 29, 2022
miRNA seq differential analysis usegalaxy.eu support workflow	0	571	August 20, 2020
Error with DESeq2 "all samples have 0 counts for all genes"	1	2978	February 4, 2019
What file to use for Deseq2 and where to get it from usegalaxy.org support gtn-tutorial , tool-dev , salmon	3	676	March 19, 2021
Need some guidance, analyzing RNA seq for the first time usegalaxy.org support	1	638	December 24, 2019

DESeq2 with HISAT2 output?

Related topics