GFF file with UTR information

I have been using a GenBank-based GFF file to map E. coli RNA-seq reads. However, the GenBank annotation does not include any UTR information. Are E. coli GFF files with UTR information available? If not, is there a way to extend annotated coding regions to include 5’ and 3’ UTRs using RNA-seq data? Thanks in advance.

Hi @cjain
This question is not specific to Galaxy, so researchers who work with bacteria may be a better source of advice. Keep in mind that many E. coli genes are organized in operons. Are you after leader sequences? When I worked with another bacterial species years ago, I did not come across any whole-genome operon or transcript annotation. Try searching Google Scholar for "whole genome operon annotation" or something similar.
Kind regards,
Igor

Thanks for replying. However, I would still like to know whether there are any tools that can take RNA-seq data as input and extend coding regions so that UTRs are included.

Hi @cjain

There are many tools in Galaxy that can adjust data points in tabular files.

Those in the group Operate on Genomic Intervals are a good place to start.
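
As a rough illustration of the kind of interval adjustment those tools perform, here is a minimal Python sketch that widens each feature in a GFF file by a fixed, strand-aware flank. The function name `extend_gff_line` and the flank sizes are hypothetical and chosen only for the example. A fixed flank is a crude stand-in for real UTRs, whose lengths vary by gene, and in operons the extended intervals may overlap neighbouring genes. It does not use RNA-seq coverage at all.

```python
def extend_gff_line(line, flank_5=50, flank_3=50, seq_len=None):
    """Widen one GFF feature line by strand-aware flanks.

    flank_5 / flank_3 are assumed UTR lengths in bases (arbitrary defaults).
    seq_len, if given, caps the end coordinate at the sequence length.
    Comment, blank, and malformed lines are returned unchanged.
    """
    line = line.rstrip("\n")
    if not line.strip() or line.startswith("#"):
        return line
    cols = line.split("\t")
    if len(cols) < 9:
        return line

    start, end, strand = int(cols[3]), int(cols[4]), cols[6]

    # GFF coordinates are 1-based with start <= end regardless of strand,
    # so on the minus strand the 5' end sits at the larger coordinate.
    if strand == "-":
        left, right = flank_3, flank_5
    else:
        left, right = flank_5, flank_3

    start = max(1, start - left)
    end = end + right
    if seq_len is not None:
        end = min(end, seq_len)

    cols[3], cols[4] = str(start), str(end)
    return "\t".join(cols)


def extend_gff_file(in_path, out_path, **kwargs):
    """Apply extend_gff_line to every line of a GFF file."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(extend_gff_line(line, **kwargs) + "\n")
```

For example, `extend_gff_file("genes.gff", "genes_extended.gff", flank_5=100, flank_3=50)` would write a copy of a hypothetical `genes.gff` with every feature widened. Deriving the flanks per gene from read coverage would need additional logic that is not shown here.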

More general text manipulation tools are covered in a tutorial you can find through GTN Materials Search (query=olympics).

You might also want to explore a resource like the UCSC Table Browser: https://genome.ucsc.edu/