I would like to compute the ExN50 statistic for a de novo transcriptome assembled with Trinity.
Following the steps described on the Trinity GitHub page, I did the following, in order:
- Aligned reads back and estimated abundances (kallisto)
- Built expression matrix (kallisto)
- Computed the ExN50 stat
However, although the transcriptome consists of about 37,000 transcripts, the ExN50 statistic ends up being based on only about 27,000. These 27,000 correspond to the Trinity gene-to-transcript mappings.
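To make the discrepancy concrete, here is a minimal sketch of how I understand an ExN50-style calculation when transcripts are first collapsed to genes. It is not Trinity's actual script: the function names (`collapse_to_genes`, `exn50`) and the toy data are made up for illustration, and the rule for choosing one representative isoform per gene (the most highly expressed one) is my assumption. If the statistic works on genes in this way, the number of features counted would equal the number of genes rather than the number of transcripts.

```python
from collections import defaultdict


def collapse_to_genes(transcripts, gene_of):
    """Collapse isoforms into one feature per gene.

    transcripts: dict transcript_id -> (length, expression)
    gene_of:     dict transcript_id -> gene_id
    Assumption: a gene's expression is the sum over its isoforms, and its
    length is that of its most highly expressed isoform.
    """
    by_gene = defaultdict(list)
    for tid, (length, expr) in transcripts.items():
        by_gene[gene_of[tid]].append((length, expr))
    features = []
    for isoforms in by_gene.values():
        rep_length = max(isoforms, key=lambda iso: iso[1])[0]
        features.append((rep_length, sum(expr for _, expr in isoforms)))
    return features


def exn50(features, x_percent):
    """Return (N50, number of features) for the most highly expressed
    features that together account for x_percent of total expression."""
    ranked = sorted(features, key=lambda f: f[1], reverse=True)
    target = sum(expr for _, expr in ranked) * x_percent / 100.0
    selected, cumulative = [], 0.0
    for length, expr in ranked:
        if cumulative >= target:
            break
        selected.append(length)
        cumulative += expr
    # Standard N50 over the lengths of the selected features.
    selected.sort(reverse=True)
    half, running = sum(selected) / 2.0, 0
    for length in selected:
        running += length
        if running >= half:
            return length, len(selected)
    return 0, 0


# Toy data (hypothetical): 4 transcripts belonging to 3 genes.
toy_transcripts = {
    "g1_i1": (1000, 50.0),
    "g1_i2": (400, 10.0),
    "g2_i1": (2000, 30.0),
    "g3_i1": (500, 10.0),
}
toy_gene_of = {"g1_i1": "g1", "g1_i2": "g1", "g2_i1": "g2", "g3_i1": "g3"}
toy_features = collapse_to_genes(toy_transcripts, toy_gene_of)
```

In this toy case, `exn50(toy_features, 100)` is computed over 3 features even though there are 4 transcripts, which is the same kind of gap as the 37,000 versus 27,000 I am seeing.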
My questions are:
- How is this possible?
- Is this a reasonable “approximation” for computing this statistic?
Thank you for your help!