Hi all,
Is it possible to use GOseq in Galaxy for an RNA-seq analysis of a bacterial species that is not included in the Gene categories list?
I uploaded its GTF file and selected it under "Use a categories file from history", but the job failed with an error after 3 minutes.
Has anyone faced the same issue? Thanks
Hi @Seraph
GOseq will not understand how to use a GTF annotation. Plus, that file type usually doesn’t contain ontology information. But if yours does, that data can be parsed into the format the tool is expecting.
See the help section on the tool form to learn the content/format of accepted custom inputs. The first column should match the Gene ID used in the other inputs. The second column holds the GO/KEGG terms.
Gene categories file
This tool can get GO and KEGG categories for some genomes. The three GO categories are GO:MF (Molecular Function - molecular activities of gene products), GO:CC (Cellular Component - where gene products are active), GO:BP (Biological Process - pathways and larger processes made up of the activities of multiple gene products). If your genome is not available, you will also need a file describing the membership of genes in categories. The category file should have two columns with an optional header row, with Gene ID in the first column and category identifier in the second column. As the mapping between categories and genes is usually many-to-many, this table will usually have multiple rows with the same Gene ID or the same category identifier.
Example:
ENSG00000162526 GO:0000003
ENSG00000198648 GO:0000278
ENSG00000112312 GO:0000278
ENSG00000174442 GO:0000278
ENSG00000108953 GO:0000278
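If your GTF does carry GO terms, something like the Python sketch below could turn it into that two-column layout outside of Galaxy. It is only an illustration under a few assumptions: the gene identifier is stored as `gene_id "..."` in the ninth column, GO terms appear somewhere in that same column written as `GO:` plus seven digits, and the output should be tab-separated. The attribute name `Ontology_term` in the sample lines and the function names are made up for the example, so check what your own file actually contains.

```python
import re

# Assumption: standard GTF attribute syntax, gene_id "value";
GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')
# Assumption: GO terms are written as GO: followed by seven digits
GO_RE = re.compile(r"GO:\d{7}")


def gtf_to_categories(lines):
    """Return sorted, de-duplicated (gene_id, go_term) pairs from GTF lines."""
    pairs = set()
    for line in lines:
        # Skip comment and blank lines
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attributes = fields[8]
        gene = GENE_ID_RE.search(attributes)
        if gene is None:
            continue
        # One output row per gene/term pair (many-to-many mapping)
        for term in GO_RE.findall(attributes):
            pairs.add((gene.group(1), term))
    return sorted(pairs)


def format_categories(pairs):
    """Render the pairs as tab-separated text, one pair per line."""
    return "".join(f"{gene}\t{term}\n" for gene, term in pairs)


# Hypothetical sample records; replace with the lines of your own file
example = [
    "#comment line",
    'chr\tsrc\tCDS\t1\t90\t.\t+\t0\tgene_id "geneA"; Ontology_term "GO:0000003,GO:0000278";',
    'chr\tsrc\tCDS\t100\t190\t.\t+\t0\tgene_id "geneB"; product "hypothetical protein";',
]

print(format_categories(gtf_to_categories(example)), end="")
```

Genes without any GO term (like `geneB` here) simply produce no rows. The Gene IDs written out must still match the IDs used in your differential expression and gene length inputs.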
Help for data manipulations: Data Manipulation Olympics
Hope that helps