Extracting data from a .gff3 file

Afternoon, hopefully a simple question. I want to extract data from a .gff3 file and add it to the matching gene ID in a list of genes held in a separate file. However, the gene ID in the .gff3 file sits among other data in column 9. How do I pull out just the gene ID from that column? I’m sure I did something similar in a tutorial ages ago but can’t find it!

Hi @kate2

I would suggest trying this!

First, convert your GFF3 to GTF format with gffread. Then test whether GTF2GeneList extracts the values you want from the attributes (9th column).

A tutorial has an example in this step (maybe what you used before!) → :graduation_cap: GTN Hands-on: Generating a single cell matrix using Alevin / Single Cell (generate-a-transcript-to-gene-map)

You likely know this already, but these tutorials cover more data manipulations!
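If you ever want to script this outside Galaxy, here is a minimal Python sketch of the same idea: split column 9 on `;`, split each piece on `=`, and look up each gene in your list by its ID. The function names, the example records, and the gene IDs are all made up for illustration, and it assumes your gene list uses the same value as the `ID=` attribute on `gene` rows. Some annotation sources store that value under a different key or with a prefix, so check a few lines of your own file first.

```python
from urllib.parse import unquote


def parse_attributes(column9):
    """Turn 'ID=gene0001;Name=abcA' into {'ID': 'gene0001', 'Name': 'abcA'}."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            # GFF3 percent-encodes reserved characters inside attribute values.
            attrs[key.strip()] = unquote(value.strip())
    return attrs


def genes_from_gff3(lines, feature_type="gene", id_key="ID"):
    """Map each gene ID to its coordinates plus all of its column 9 attributes."""
    genes = {}
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Keep only well-formed rows of the requested feature type (column 3).
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        attrs = parse_attributes(cols[8])
        if id_key in attrs:
            genes[attrs[id_key]] = {
                "seqid": cols[0],
                "start": int(cols[3]),
                "end": int(cols[4]),
                "strand": cols[6],
                **attrs,
            }
    return genes


def annotate(gene_ids, genes, fields=("seqid", "start", "end", "Name")):
    """Return one row per gene ID, with 'NA' where the GFF3 has no match."""
    rows = []
    for gene_id in gene_ids:
        info = genes.get(gene_id, {})
        rows.append([gene_id] + [info.get(field, "NA") for field in fields])
    return rows


# Hypothetical example data; with real files, pass open("my.gff3") instead.
example_gff3 = [
    "##gff-version 3",
    "\t".join(["chr1", "src", "gene", "100", "900", ".", "+", ".",
               "ID=gene0001;Name=abcA;biotype=protein_coding"]),
    "\t".join(["chr1", "src", "mRNA", "100", "900", ".", "+", ".",
               "ID=mrna0001;Parent=gene0001"]),
    "\t".join(["chr2", "src", "gene", "50", "400", ".", "-", ".",
               "ID=gene0002;Name=xyzB"]),
]
gene_list = ["gene0002", "gene0001", "gene0003"]

genes = genes_from_gff3(example_gff3)
annotated = annotate(gene_list, genes)
for row in annotated:
    print("\t".join(str(value) for value in row))
```

Genes that are in your list but missing from the GFF3 come out as `NA` rather than being dropped, which makes mismatched IDs easy to spot.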

Please let us know how this goes or if you need more help! :slight_smile:

Thanks for your help - actually I ended up doing it in Excel, but will re-try with your suggestion as it’s good practice for the future. Really useful to have a bookmark to these tutorials, thanks.

Great, glad you found a way! :slight_smile: