RNA alignment not detecting indels

Hello. I've been trying to identify indels in my RNA alignments using IGV, but I still can't see them. I don't know what to do anymore, since I can identify them in the exome but not in the transcriptome. STAR doesn't identify them, but Bowtie2 can. Could this be a problem with STAR?


I need help. This is for my master's degree.

Welcome, @cmarques2000

If your data is RNA, you can map it against RNA (a transcriptome) with a non-splice-aware tool that is designed to capture and report indels. BWA or BWA-MEM are good tool choices for RNA-to-RNA or DNA-to-DNA mappings. Bowtie2 may also work, but there are some differences between the two that you could investigate and compare.

When mapping RNA to DNA, you'll need to use a splice-aware alignment tool. These can be tuned to capture indels if you have the "right" kind of read data. This Q&A from the tool author covers how to do it → STAR indel mapping? · Issue #477 · alexdobin/STAR · GitHub

Hope this helps! :slight_smile:

The RNA sample was previously aligned with BWA, and apparently the alignment was really bad. So I went on to try STAR, but at the positions where I know an indel exists, it is not being reported as an indel. I'd say it's because the indel sequence is similar to the reference genome sequence, but hg19 was used for the exome.
It should be detected by both STAR and Bowtie2, but the result doesn't give me enough confidence to say that it is actually an indel.

Hi @cmarques2000

Thanks for the extra comments. Strange about your results. Maybe align your reads to hg19 and hg38 with a few different tools, with different parameter sets, then load up all the BAMs and annotation files (CDS regions) into a genome browser (IGV, or UCSC) and inspect the results to see if you can figure out what is going on? Sort of a matrix of results that includes the raw data (reads) too?
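If eyeballing each BAM in the browser gets tedious, here is a rough sketch of how the same comparison could be tallied from SAM text (for example, the output of `samtools view` on each BAM). It only parses the standard CIGAR field, so no extra libraries are needed. The function names `scan_read` and `count_indels` and the window coordinates are made up for illustration, and this is just one way to do it, not how IGV or any aligner counts indels.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def scan_read(pos, cigar, start, end):
    """Walk one CIGAR string (hypothetical helper).

    pos is the 1-based leftmost mapped position from the SAM POS field.
    Returns (covered, n_insertions, n_deletions) for the window start..end.
    """
    ref = pos
    covered = False
    ins = dele = 0
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":
            # Aligned block: counts as real coverage of the window.
            if ref <= end and ref + n - 1 >= start:
                covered = True
            ref += n
        elif op == "D":
            # Deletion spans reference positions ref..ref+n-1.
            if ref <= end and ref + n - 1 >= start:
                dele += 1
            ref += n
        elif op == "N":
            # Spliced-out intron: consumes reference, is not a deletion.
            ref += n
        elif op == "I":
            # Insertion sits just before reference position ref.
            if start <= ref <= end:
                ins += 1
        # S, H and P do not consume reference bases.
    return covered, ins, dele


def count_indels(sam_lines, chrom, start, end):
    """Tally reads covering the window and reads carrying an indel in it."""
    counts = {"reads": 0, "with_insertion": 0, "with_deletion": 0}
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue
        flag, rname, cigar = int(fields[1]), fields[2], fields[5]
        if flag & 4 or rname != chrom or cigar == "*":
            continue  # unmapped, other chromosome, or no CIGAR
        covered, ins, dele = scan_read(int(fields[3]), cigar, start, end)
        if covered or ins or dele:
            counts["reads"] += 1
            counts["with_insertion"] += 1 if ins else 0
            counts["with_deletion"] += 1 if dele else 0
    return counts
```

Running `count_indels` on the same window for each aligner's output gives one row per tool in that matrix. If STAR shows reads spanning the site with a long `N` (or soft clips) where Bowtie2 shows `I` or `D`, that points to how the aligner chose to represent the gap rather than to missing reads.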

If you can get access to the reads that were involved in annotating the indel, that is something to explore, too. I suppose it could be associated with a particular sample type, so not all sample groups of reads would contain it?

Just tossing out ideas to explore this more. :slight_smile:

I'd say the problem is STAR not aligning correctly, for some reason that I can't figure out. Also, since the exome data I have was aligned to hg19, I should keep using hg19 for the RNA. So hg38 is not an option.