Is there a protocol for miRNA Sequencing?

LaiaGutierrez · January 20, 2022, 10:04am

Hi,

I am sequencing miRNA and other small RNA. I found a protocol to obtain the counts for smallRNA. However, this involves getting rid of miRNAs. I wanted to know if there is another protocol to obtain the miRNAs.

In this protocol for smallRNA they also align the reads to reference sequences from Drosophila. I am working with humans. I would like to know where I can download these annotation and reference files of miRNA and rRNA.

Thank you in advance,

Laia

Flow · January 20, 2022, 3:02pm

Dear @LaiaGutierrez,
I am not certain which training material you followed, but you can get the counts for your miRNAs if you have an annotation file for them. I assume you did something like feature counting with a BED or GTF/GFF file that contained small RNA regions. If you replace that file with one that contains miRNA annotations, you will get the miRNA counts.

You can obtain annotations for dm6 (Drosophila) from UCSC or Ensembl.

Kind regards,
Florian

LaiaGutierrez · January 20, 2022, 4:36pm

Thanks @Flow ,

I still haven’t analyzed the miRNA because I didn’t know where to find the annotation file.
I used this tutorial: Differential abundance testing of small RNAs

I think I have to align my reads to a miRNA reference sequences file. But I am not sure where to find these reference sequences. Is it the same as the annotation file?

In my case I am working with humans. I attach a picture, so you can tell me if this is what I have to download. In the protocol I followed they use an annotation file and two different reference sequences, one for rRNA and one for miRNA. I don’t know where to find this.

I also don’t know what exactly is the difference between these files (BED, GTF/GFF…).

I am very new to bioinformatics and I have trouble with some concepts!

Thanks!

Flow · January 25, 2022, 12:29pm

Dear @LaiaGutierrez,

I think I have to align my reads to a miRNA reference sequences file.

Yes and no. You have three options: (A) map only to the reference genome, (B) map only to the miRNA sequences from a known database, or (C) combine A and B. Option (B) relies on the database, so it is quicker, but it can give misleading counts because of multi-mapped reads, and it does not cover unknown miRNA transcripts. I am not 100% familiar with the detailed differences, but a combination (option C) may be the better approach in your case for human, as mentioned in this article.

But I am not sure where to find these reference sequences.

One of the biggest miRNA databases is miRBase.
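
As a minimal sketch, assuming the FASTA file you download from miRBase has headers that start with a species-prefixed name (`hsa-` for human), you could pull out the human entries roughly like this. The function names and the toy records are made up for illustration; they are not real miRBase entries.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def select_species(records, prefix="hsa-"):
    """Keep records whose header starts with the given species prefix."""
    return [(h, s) for h, s in records if h.startswith(prefix)]


# Toy input with invented names and sequences; a real file is much larger.
toy_fasta = """\
>hsa-example-1 toy human entry
UGAGGUAGUAGG
UUGUAUAGUU
>dme-example-2 toy fly entry
UGGAAUGUAAAG
""".splitlines()

human = select_species(read_fasta(toy_fasta))
print(human)
```

In practice you would read the lines from the downloaded file instead of `toy_fasta`, and then use the filtered FASTA as the reference for mapping.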

Is it the same as the annotation file?

No. A reference file for alignment is typically in FASTA format. An annotation file is usually in GTF/GFF format and is used, e.g., for counting and overlap analysis.
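
To make the distinction concrete, here is a small sketch of what "counting with an annotation" means: the annotation only stores coordinates of features, and counting asks how many aligned reads overlap each feature. The feature names, coordinates, and reads are invented, and real tools handle strand, multi-mapping, and performance far more carefully than this.

```python
# Annotation: where each feature lies (chromosome, start, end, name).
# Coordinates are treated as 1-based and inclusive, as in GTF.
features = [
    ("chr1", 100, 121, "mirA"),
    ("chr1", 500, 522, "mirB"),
]

# Aligned reads: (chromosome, start, end), e.g. taken from an alignment file.
reads = [
    ("chr1", 101, 120),
    ("chr1", 105, 125),
    ("chr1", 510, 530),
    ("chr2", 100, 120),
]


def count_overlaps(features, reads):
    """Count, for each feature, the reads that overlap it by at least one base."""
    counts = {name: 0 for _, _, _, name in features}
    for r_chrom, r_start, r_end in reads:
        for f_chrom, f_start, f_end, name in features:
            same_chrom = r_chrom == f_chrom
            # Two inclusive intervals overlap if each starts before the other ends.
            if same_chrom and r_start <= f_end and f_start <= r_end:
                counts[name] += 1
    return counts


counts = count_overlaps(features, reads)
print(counts)
```

The FASTA reference is what the reads get aligned to; the annotation is only consulted afterwards, to decide which feature each aligned read belongs to.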

In my case I am working with humans. I attach a picture, so you can tell me if this is what I have to download. In the protocol I followed they use an annotation file and two different reference sequences, one for rRNA and one for miRNA. I don’t know where to find this.

I think for human it is better to go to Ensembl and download the Gene sets file, which you can filter for miRNAs.
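
A rough sketch of that filtering step, assuming the gene set is a GTF file whose ninth column carries an attribute such as `gene_biotype "miRNA"` (please check this against the file you actually download). The helper name and the toy lines are made up for illustration.

```python
def is_mirna(gtf_line):
    """Return True for a GTF record whose attributes mark it as a miRNA."""
    if gtf_line.startswith("#"):
        return False  # header/comment line
    fields = gtf_line.rstrip("\n").split("\t")
    # A GTF record has nine tab-separated columns; attributes are the last one.
    return len(fields) == 9 and 'gene_biotype "miRNA"' in fields[8]


# Toy records with invented IDs and coordinates.
toy_gtf = [
    "#!toy header line",
    "\t".join(["1", "toy", "gene", "100", "180", ".", "+", ".",
               'gene_id "GENE_A"; gene_biotype "miRNA";']),
    "\t".join(["1", "toy", "gene", "900", "2000", ".", "-", ".",
               'gene_id "GENE_B"; gene_biotype "protein_coding";']),
]

mirna_lines = [line for line in toy_gtf if is_mirna(line)]
print(len(mirna_lines))
```

With a real file you would loop over its lines and write the ones that pass `is_mirna` to a new GTF, which can then be used as the annotation for counting.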

I also don’t know what exactly is the difference between these files (BED, GTF/GFF…).

You can find a description of various file formats here.

Cheers,
Florian