Earlier I used the GTF-to-GFF feature to convert the file, but now I cannot find the option anymore. Can anyone please help me with this? I have my RNA-seq file in GTF format and I want to convert it to a GFF file.
Hello @dekhangmigmar9
Do you want a GFF result or a GFF3 result?
If GFF, then a GTF dataset already meets the content/format specification. The GTF format is a stricter version of GFF. The difference is the content of the 9th field (“group” versus “attributes”).
If GFF3, then you can try the tool gffread as an alternative. There are choices to make regarding the data to include in the output, so exploring those options and reviewing the different possible results will be important.
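For anyone working outside Galaxy, here is a minimal sketch of the same conversion with the command-line gffread, wrapped in Python. It assumes gffread is installed and on your PATH, and it relies on gffread writing GFF3 by default with -o naming the output file. The function names and file names are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess


def build_gffread_command(gtf_path, gff3_path):
    """Build the argument list for a basic GTF -> GFF3 conversion."""
    # gffread emits GFF3 unless told otherwise; -o sets the output file.
    return ["gffread", gtf_path, "-o", gff3_path]


def convert_gtf_to_gff3(gtf_path, gff3_path):
    """Run gffread; raises if the tool is missing or exits with an error."""
    if shutil.which("gffread") is None:
        raise RuntimeError("gffread was not found on PATH")
    subprocess.run(build_gffread_command(gtf_path, gff3_path), check=True)


# Example call (hypothetical file names):
# convert_gtf_to_gff3("annotation.gtf", "annotation.gff3")
```

gffread has further options that change which features and attributes end up in the output, so it is worth comparing the result against the input before using it downstream.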
Just to make it more confusing, many will name/label GFF3 annotation as simply GFF, and there are hybrid GFF/GTF annotation datasets commonly available. Sometimes the more specific annotation is not needed by a tool. For example, the tool may just be interpreting feature coordinates, not labels.
Review the three different format types in the first FAQ linked below. These three are all distinct datatypes at a technical level. Tools may be expecting a particular format AND content for supplied reference annotation. GFF and GTF are more closely related format-wise to each other than either is to GFF3.
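As a rough illustration of how the 9th column differs between the formats, here is a small Python sketch. It assumes GTF attributes look like key "value"; pairs and GFF3 attributes look like key=value pairs separated by semicolons. The function names are hypothetical, and the guess is only a heuristic for eyeballing a file, not a validator.

```python
import re


def parse_gtf_attributes(field):
    """Parse a GTF-style 9th column: key "value"; key "value";"""
    return dict(re.findall(r'(\S+) "([^"]*)"', field))


def parse_gff3_attributes(field):
    """Parse a GFF3-style 9th column: key=value;key=value"""
    pairs = (item.split("=", 1) for item in field.strip().split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def guess_flavor(line):
    """Heuristic guess at the annotation flavor from the 9th column only."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 9:
        raise ValueError("expected 9 tab-separated columns")
    ninth = columns[8]
    if parse_gtf_attributes(ninth):
        return "GTF"
    if "=" in ninth:
        return "GFF3"
    # Anything else is treated as a plain GFF "group" value.
    return "GFF"


# Made-up example lines, one per style of 9th column.
gtf_line = 'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";'
gff3_line = "chr1\tsrc\texon\t100\t200\t.\t+\t.\tID=exon1;Parent=T1"
gff_line = "chr1\tsrc\texon\t100\t200\t.\t+\t.\tT1"
```

Running guess_flavor on a few lines from your own file is a quick way to see which style of 9th column a tool will actually be given, whatever the file extension says.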
Also, in many use-cases, it is better to use a GTF dataset with RNA-seq tools. Why? More are designed to parse GTF annotation than GFF3, and you (usually) want to use the same exact reference annotation for all steps in the same analysis. This second FAQ below is related to differential expression analysis, but the annotation format and related help apply to many other analysis protocols.
FAQs: https://galaxyproject.org/support/#getting-inputs-right
Hope that helps!
I have the GTF data from the database for the h19 transcriptome reference. I would like to convert the format to GFF. I used to use the GTF-to-GFF feature on the left side of the Galaxy toolbox, but the feature is not available now. Maybe Galaxy was updated and the feature was removed, but I am not sure. How is using GFF different from using GFF3? I want to align the RNA-seq data from my sample to the H19 transcriptome data that I have, so that I can count the transcript reads.
Hello @dekhangmigmar9 ,
As @jennaj mentioned, the GTF file format already meets the content/format specification for GFF, so I believe you should be able to use the GTF file as a GFF input if that is your goal. If you wish to change the datatype in Galaxy, you can do so via the “Edit attributes” button on the history item.
You can read more about the GFF3 format in the Galaxy datatypes documentation, alongside the FAQ posted by Jen: https://galaxyproject.org/learn/datatypes/#gff3