GOseq: genome not listed in Gene categories

Hi all,

Is it possible to use GOseq in Galaxy for an RNA-seq analysis of a bacterial species that is not included in the Gene categories list?

I uploaded its GTF file and selected "Use a categories file from history", but I got an error after 3 minutes.

Has anyone faced the same issue? Thanks

Hi @Seraph

GOseq will not understand how to use a GTF annotation. Plus, that file type usually doesn’t contain ontology information. But if yours does, that data can be parsed into the format the tool is expecting.

See the help section on the tool form to learn the content/format of accepted custom inputs. The first column should match the Gene ID used in the other inputs. The second column holds the GO/KEGG terms.

Gene categories file

This tool can get GO and KEGG categories for some genomes. The three GO categories are GO:MF (Molecular Function - molecular activities of gene products), GO:CC (Cellular Component - where gene products are active), GO:BP (Biological Process - pathways and larger processes made up of the activities of multiple gene products). If your genome is not available, you will also need a file describing the membership of genes in categories. The category file should have two columns with an optional header row, with Gene ID in the first column and category identifier in the second column. As the mapping between categories and genes is usually many-to-many, this table will usually have multiple rows with the same Gene ID and category identifier.

Example:

ENSG00000162526 GO:0000003
ENSG00000198648 GO:0000278
ENSG00000112312 GO:0000278
ENSG00000174442 GO:0000278
ENSG00000108953 GO:0000278
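
If your GTF does carry GO terms, here is a minimal Python sketch of the parsing step. It rests on assumptions you should check against your own file: each record has a `gene_id "..."` attribute in the ninth column, GO terms appear somewhere in that same column as `GO:` followed by seven digits, and the output is tab-separated. The function names and the `Ontology_term` attribute in the example line are made up for illustration and are not part of GOseq or Galaxy.

```python
import re

# Assumption: standard GTF attribute syntax, e.g. gene_id "b0001";
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')
# Assumption: GO terms are written as GO: followed by seven digits.
GO_TERM_RE = re.compile(r"GO:\d{7}")


def gtf_to_categories(gtf_lines):
    """Return sorted, de-duplicated (gene_id, go_term) pairs from GTF lines."""
    pairs = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attributes = fields[8]
        gene_match = GENE_ID_RE.search(attributes)
        if gene_match is None:
            continue  # no gene_id, nothing to map
        for term in GO_TERM_RE.findall(attributes):
            pairs.add((gene_match.group(1), term))
    return sorted(pairs)


def to_tabular(pairs):
    """Format pairs as two tab-separated columns, one pair per row."""
    return "".join(f"{gene_id}\t{term}\n" for gene_id, term in pairs)


# Hypothetical record; real attribute names depend on the annotation source.
example = [
    'chr\tsrc\tgene\t1\t100\t.\t+\t.\t'
    'gene_id "geneA"; Ontology_term "GO:0000003,GO:0000278";\n',
]
print(to_tabular(gtf_to_categories(example)), end="")
```

To use it on a real file, pass the open file handle to `gtf_to_categories`, save the text from `to_tabular` to a file, and upload that to your history as a tabular dataset. Genes with no GO term in the GTF are simply left out, so check how many genes actually end up in the result before running the tool.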

Help for data manipulations: Data Manipulation Olympics

Hope that helps :slight_smile: