Generating a New GFF File from a Previously Merged StringTie File

I have an out-there question. The genome I am working with has a very poor annotation, so on the first round of sequencing samples I ran StringTie to identify novel transcripts. As anticipated, after BLASTing the sequences, a lot of these transcripts turned out to be actual genes, so I have already done the work of correlating MSTRG numbers to genes. I now have a new set of samples and would like to use the merged annotation I created as the reference annotation, while still having StringTie search for novel transcripts. The issue is that when I do that, it reassigns the original MSTRG numbers, and I no longer have a file that correlates the sequences to genes.

I need to either create a new GFF that replaces the original MSTRG IDs with the genes I identified, OR find a way to have StringTie NOT reassign MSTRG numbers.

Does anyone have any suggestions on how to approach either option?

Thank you!


To use what you have already assigned, you would need to create a new reference annotation file, then incorporate it into the analysis at the usual step where known annotation is used.
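As a rough illustration of building that new reference file, here is a minimal Python sketch. It assumes the merged file uses GTF-style attributes (`gene_id "MSTRG.12";`) and that you have your own table relating MSTRG IDs to gene names. The function names, the mapping, and the file paths are hypothetical; adapt the pattern if your file is true GFF3 (`ID=...`). Only `gene_id` is rewritten here; `transcript_id` values are left unchanged.

```python
import re

# Matches the GTF-style attribute gene_id "MSTRG.<n>".
# \b keeps it from matching inside other keys such as ref_gene_id.
GENE_ID_RE = re.compile(r'\bgene_id "(MSTRG\.\d+)"')


def relabel_gtf_line(line, mstrg_to_gene):
    """Swap a mapped MSTRG gene_id for the curated gene name.

    IDs that are absent from the mapping are left untouched.
    """
    if line.startswith("#"):  # keep header/comment lines as they are
        return line

    def swap(match):
        old_id = match.group(1)
        return 'gene_id "{}"'.format(mstrg_to_gene.get(old_id, old_id))

    return GENE_ID_RE.sub(swap, line)


def relabel_gtf(in_path, out_path, mstrg_to_gene):
    """Write a copy of the annotation with curated gene IDs (paths are placeholders)."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(relabel_gtf_line(line, mstrg_to_gene))
```

For example, `relabel_gtf("merged.gtf", "curated.gtf", {"MSTRG.12": "geneA"})` would write a copy in which that one locus carries the curated name. It is worth checking the output against the original before using it as the reference annotation.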

Or, you could assign/map new identifiers after the analysis is completed. That is, do your mapping once you have the differential expression (DE) results.
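A minimal sketch of that post hoc mapping, assuming a tab-separated DE results table with an identifier column and a lookup that relates the identifiers from this run to gene names. The column name `gene_id`, the added `gene_name` column, and the function name are all hypothetical.

```python
import csv
import io


def annotate_de_table(de_text, id_to_gene, id_column="gene_id"):
    """Return the DE table text with an extra gene_name column appended.

    Rows whose identifier is not in the lookup get an empty gene_name.
    """
    reader = csv.DictReader(io.StringIO(de_text), delimiter="\t")
    fieldnames = list(reader.fieldnames) + ["gene_name"]

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row["gene_name"] = id_to_gene.get(row[id_column], "")
        writer.writerow(row)
    return out.getvalue()
```

The lookup has to match the identifiers produced by the current run, so if MSTRG numbers were reassigned, the table from the first round would need to be rebuilt or matched up by sequence or coordinates first.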

The second is probably easier, but you could review a tool like this if you want to try the first → Maker genome annotation pipeline