Obtain 3' UTR sequences from assembled transcripts

Hi,
I want to obtain 3’UTR sequences from my assembled transcripts.

What tool should I use?

Or would it be possible to add one of the following tools to the online Galaxy server?
ExUTR (GitHub: huangzixia/ExUTR, a tool for genome-wide 3'-UTR prediction from RNA-Seq data) or
CodAn (GitHub: pedronachtigall/CodAn, CDS prediction in transcripts).

Hi @s316052000
Maybe have a look at TransDecoder. Enable all outputs and try it with the ‘output only longest ORF’ option. It can produce GFF and BED files: the GFF annotates the UTRs, and you can also get the UTR positions from the BED file. The sequences can then be extracted with any FASTA extraction tool, such as getfastabed.
You will probably miss transcripts with short ORFs, and some complicated cases, such as polycistronic transcripts, might be difficult to process correctly. Also keep in mind that transcripts assembled by some tools, such as Trinity, can be on either strand, so if you use the BED file, make sure you take the orientation of the ORF into account when selecting the 3’ UTR.
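To illustrate the orientation point, here is a rough Python sketch of how the 3’ UTR could be picked per strand. It assumes the BED file is in BED12 layout with the transcript ID in column 1, the strand in column 6, and the CDS given by thickStart/thickEnd (columns 7 and 8, 0-based, end-exclusive). The function names are made up for this example and are not part of TransDecoder or Galaxy. Whether the stop codon falls inside the CDS interval depends on the tool, so please check a few records by eye.

```python
# Sketch only: assumes BED12 records where thickStart/thickEnd delimit the CDS
# on the transcript. Helper names are hypothetical.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def three_prime_utr(transcript_seq, thick_start, thick_end, strand):
    """Return the 3' UTR of one ORF, written 5'->3' relative to the ORF."""
    if strand == "+":
        # ORF on the forward strand: the 3' UTR is everything after the CDS.
        return transcript_seq[thick_end:]
    if strand == "-":
        # ORF on the reverse strand: the 3' UTR lies before the CDS in
        # transcript coordinates and has to be reverse-complemented.
        return reverse_complement(transcript_seq[:thick_start])
    raise ValueError(f"unexpected strand: {strand!r}")


def utrs_from_bed(bed_lines, transcripts):
    """Map ORF name (BED column 4) to its 3' UTR sequence.

    bed_lines: iterable of tab-separated BED lines.
    transcripts: dict of transcript ID -> sequence.
    """
    utrs = {}
    for line in bed_lines:
        # Skip blank lines and track/comment headers.
        if not line.strip() or line.startswith(("track", "#")):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue
        chrom, name, strand = fields[0], fields[3], fields[5]
        thick_start, thick_end = int(fields[6]), int(fields[7])
        if chrom not in transcripts:
            continue
        utr = three_prime_utr(transcripts[chrom], thick_start, thick_end, strand)
        if utr:  # ORFs running to the transcript end have no 3' UTR
            utrs[name] = utr
    return utrs
```

Within Galaxy itself the same idea would be to turn each record into a new BED interval (thickEnd to transcript end for "+", 0 to thickStart for "-") and run the FASTA extraction with strand awareness enabled. ORFs flagged as 3’ partial should probably be dropped first, since their 3’ UTR is not in the assembly.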
Hope this makes sense.
Kind regards,
Igor
