How to filter output of GFFcompare (class_code i,u,x) and extract filtered transcripts in GTF file?

Hello! I am new to RNA-seq, but I have to conduct a transcriptome analysis and a differential expression (DE) analysis of lncRNAs.

Currently, I’m at the stage of discovering novel lncRNAs and facing some difficulties.

The tool GFFcompare generates several output files, but two of them are important to me: the annotated transcripts and the TMAP. I would like to filter the GFFcompare output to keep only transcripts with class_code “u”, “i”, or “x” and with exons>=2.

The TMAP file contains all the necessary attributes, and I can filter it by column using “Filter data on any column using simple expressions” or “Select lines that match an expression”, but that file is tabular, and I need a GTF result. On the other hand, I can’t filter the annotated transcripts file, because it doesn’t contain all of those attributes.
My questions are:
How should I proceed, which GFFcompare output to choose, and which Galaxy filter tools should I use?
Is it possible that if I decide to filter the TMAP, I can then convert the tabular file to GTF?

I will be grateful for any guidance and advice you can provide!

Hi @Seahorse

These are the ways I can think of to do this. None are automatic, but maybe they help anyway?

The filter needs to be against a file that contains all the “filter variables”.

That could be what you have now, or you might need to combine files first (joined together on a common key). Be careful about many-to-many joins.

To convert to GFF, the tabular data needs to have reference coordinates. And after converting, you will probably need to adjust at least the 9th column (“attributes”) to have the content any downstream tools are expecting. If you just want to view the file in a browser (IGV, UCSC) then that part may not matter.
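
If scripting outside Galaxy is an option, one way to sidestep the conversion is to use the TMAP only to pick transcript IDs and then keep the matching lines of the annotated GTF. A rough Python sketch of that idea follows. It rests on assumptions: that the TMAP has a header row with columns named class_code, qry_id, and num_exons, and that the GTF's 9th column carries a transcript_id attribute equal to qry_id. The function names and the tiny sample rows are made up for illustration, so check them against your own files first.

```python
import csv
import io
import re

KEEP_CODES = {"u", "i", "x"}  # class codes named in the question
MIN_EXONS = 2                 # keep multi-exon transcripts only

# Assumption: GTF attributes look like: transcript_id "SOME_ID";
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def select_transcript_ids(tmap_handle, keep_codes=KEEP_CODES, min_exons=MIN_EXONS):
    """Return the set of qry_id values whose TMAP row passes both filters."""
    reader = csv.DictReader(tmap_handle, delimiter="\t")
    return {
        row["qry_id"]
        for row in reader
        if row["class_code"] in keep_codes and int(row["num_exons"]) >= min_exons
    }


def filter_gtf(gtf_handle, wanted_ids):
    """Yield GTF lines (transcript and exon rows) whose transcript_id is wanted.

    Comment lines and lines with fewer than 9 tab-separated fields are dropped.
    """
    for line in gtf_handle:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        match = TRANSCRIPT_ID.search(fields[8])
        if match and match.group(1) in wanted_ids:
            yield line


# Made-up miniature inputs, only to show the call pattern.
demo_tmap = io.StringIO(
    "ref_gene_id\tref_id\tclass_code\tqry_gene_id\tqry_id\tnum_exons\n"
    "-\t-\tu\tG1\tT1\t2\n"          # kept: class u, 2 exons
    "GENE_A\tTX_A\t=\tG2\tT2\t3\n"  # dropped: class =
    "-\t-\tu\tG3\tT3\t1\n"          # dropped: single exon
)
demo_gtf = io.StringIO(
    'chr1\tdemo\ttranscript\t100\t900\t.\t+\t.\ttranscript_id "T1"; gene_id "G1";\n'
    'chr1\tdemo\texon\t100\t300\t.\t+\t.\ttranscript_id "T1"; gene_id "G1";\n'
    'chr1\tdemo\texon\t600\t900\t.\t+\t.\ttranscript_id "T1"; gene_id "G1";\n'
    'chr2\tdemo\ttranscript\t50\t800\t.\t-\t.\ttranscript_id "T2"; gene_id "G2";\n'
    'chr3\tdemo\ttranscript\t10\t200\t.\t+\t.\ttranscript_id "T3"; gene_id "G3";\n'
)

wanted = select_transcript_ids(demo_tmap)
for kept_line in filter_gtf(demo_gtf, wanted):
    print(kept_line, end="")
```

With real data, the two StringIO objects would be replaced by open() calls on the downloaded TMAP and annotated GTF files. Because the output lines are copied unchanged from the GTF, the attributes column stays as GFFcompare wrote it.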

Some tools to consider, plus more covered in Data Manipulation Olympics

  • Select or filter tools
  • gffread
  • BED-to-GFF converter

Hope that helps!

Dear @jennaj , Thank you very much for the advice and the suggested data manipulation tools, which will definitely be helpful to me.
