I hope you are doing well. I need some help from you and would be very grateful for your assistance. I am currently pursuing my PhD and working on RNA transcriptomic analysis using PacBio data. I have obtained all mitochondrial transcripts from the transcriptome data and compared them to the reference mitochondrial genome using BLAST and minimap2. After sorting the transcripts by alignment length, my question is: how can I **determine the primary transcripts (3?), intermediate transcripts, and mature transcripts (mitochondrial rRNA, tRNA, and mRNA)?**
If I select the top 3 transcripts based on alignment length, would they be considered primary transcripts? If they are primary transcripts, what about the intermediate and mature transcripts? Kindly guide me, as I am confused.
Thank you for your time and assistance. I look forward to hearing from you.
Not necessarily. In a very rough sense, the “longest” transcript might be used to represent a gene, but that is a computational shortcut and is not based on biological “truth”. UCSC has a guide that you will probably find helpful: https://genome.ucsc.edu/FAQ/FAQgenes.html. Notice the links to the public curated databases – those projects define how they label transcripts and which computational tools they use.
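To make the "longest" shortcut concrete, here is a minimal Python sketch that ranks minimap2 alignments by alignment block length, assuming the output is in PAF format (the minimap2 default). The names `PafHit`, `parse_paf`, `longest_hits`, and the three example records (read names, coordinates, and the `chrM_ref` target) are made up for illustration and are not from your data. The ranking only reports which alignments are longest; it does not by itself say whether a read is a primary, intermediate, or mature transcript.

```python
from typing import NamedTuple


class PafHit(NamedTuple):
    """A few of the 12 mandatory PAF columns (hypothetical container)."""
    query: str
    query_len: int
    strand: str
    target: str
    target_start: int
    target_end: int
    aln_len: int  # PAF column 11: alignment block length


def parse_paf(text: str) -> list[PafHit]:
    """Parse tab-separated PAF lines, ignoring blank lines."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        f = line.split("\t")
        hits.append(
            PafHit(
                query=f[0],
                query_len=int(f[1]),
                strand=f[4],
                target=f[5],
                target_start=int(f[7]),
                target_end=int(f[8]),
                aln_len=int(f[10]),
            )
        )
    return hits


def longest_hits(hits: list[PafHit], n: int = 3) -> list[PafHit]:
    """Return the n hits with the largest alignment block length."""
    return sorted(hits, key=lambda h: h.aln_len, reverse=True)[:n]


# Made-up example records, NOT real data.
EXAMPLE_PAF = "\n".join(
    [
        "read_a\t1500\t0\t1500\t+\tchrM_ref\t16569\t3300\t4800\t1480\t1500\t60",
        "read_b\t9000\t0\t9000\t+\tchrM_ref\t16569\t1000\t10000\t8900\t9000\t60",
        "read_c\t2600\t10\t2600\t-\tchrM_ref\t16569\t12000\t14590\t2500\t2590\t60",
    ]
)

if __name__ == "__main__":
    for hit in longest_hits(parse_paf(EXAMPLE_PAF), n=2):
        print(hit.query, hit.strand, hit.target_start, hit.target_end, hit.aln_len)
```

The target start and end coordinates are kept alongside the length because where a read sits relative to the annotated genes on the reference is likely to be more informative than its length alone.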
Then maybe also look at expression analysis protocols? If you can find a publication that does something similar, that is probably the best way to learn about the approaches people use. This is why I included the other link to the isoform expression analysis above.
For the public protocols, you’ll likely find similar tools in Galaxy. You can of course reference a publication or tool and ask about specifics when translating a protocol (or when designing your own and you want to do some step in Galaxy but can’t find a similar tool).