Help with interpreting GO enrichment results using goseq

ding66 · October 11, 2023, 6:35pm

Hi there! I’m new to RNA-Seq analysis. By following the tutorial “Reference-based RNA-Seq data analysis”, I have so far completed mapping, annotation, and differential expression analysis. The next step I want to do is gene enrichment analysis with GO, so I followed the steps in the tutorial using the “goseq” tool. I was able to obtain and understand the “Ranked category list - Wallenius method” and “Top over-represented GO terms plot” outputs, but I couldn’t understand the “DE genes for categories (GO/KEGG terms)” output. Specifically, although the ranked category list tells me that I have x genes associated with different GO terms (with adj. p-value < 0.05), I could not find the gene_id associated with any specific GO term. All I can find in the “DE genes for categories” file are some strange numbers separated by “NA”… I have attached a figure of those results for one of my datasets (BP vs FPNP). Can somebody please show me whether there is a way to find the gene_ids associated with each GO term listed in the ranked category list (my target organism is an E. coli strain)? Many thanks in advance.
DE genes for categories (?)

Ranked category list

The GO annotation file that I uploaded to goseq

jennaj · October 11, 2023, 9:11pm

Hi @ding66

Compare these:

1. List of genes with the true/false
2. Your “The GO annotation file that I uploaded to goseq” file

Do both use the same gene ID format?

If there is a mismatch, that is likely the problem. You would need to adjust the annotation file (item 2 above) to match your existing list of genes (item 1). A “replace” tool could do that if you can find the mapping; see the “Data Manipulation Olympics” tutorial.
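If you want to check the ID formats outside of Galaxy, something like the following Python sketch could help. It assumes both files are tab-separated with the gene ID in the first column and no header line; the function names and the toy data are made up for illustration, so adjust them to your real files.

```python
import csv
import io

def read_first_column(handle, delimiter="\t"):
    """Collect the non-empty values found in the first column of a tabular file."""
    ids = set()
    for row in csv.reader(handle, delimiter=delimiter):
        if row and row[0].strip():
            ids.add(row[0].strip())
    return ids

def compare_ids(de_ids, go_ids):
    """Report which gene IDs are shared and which appear in only one file."""
    return {
        "shared": de_ids & go_ids,
        "only_in_de": de_ids - go_ids,
        "only_in_go": go_ids - de_ids,
    }

# Toy stand-ins for the two inputs (hypothetical content, not real results).
# With real data, replace these with open("your_file.tabular", newline="").
de_file = io.StringIO(
    "GL980_000001\tTRUE\n"
    "GL980_000002\tFALSE\n"
    "GL980_000003\tTRUE\n"
)
go_file = io.StringIO(
    "GL980_000001\tGO:0008150\n"
    "GL980_000002\tGO:0003674\n"
    "gl980_000003\tGO:0005575\n"  # lowercase on purpose: a format mismatch
)

report = compare_ids(read_first_column(de_file), read_first_column(go_file))
for name, ids in report.items():
    print(name, sorted(ids))
```

Any ID that lands in `only_in_de` or `only_in_go` is a candidate for a format mismatch (case, prefix, version suffix, stray whitespace inside the ID, and so on).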

I might be guessing wrong, but you can share all the inputs and we can look at this closer.

ding66 · October 11, 2023, 10:18pm

Hi Jenn,

Thanks very much for your response and suggestions. I double-checked, and I do believe that my “gene ID and DE” file (please see the picture below) and the “GO Annotation file” are a match…

Gene ID and DE file:

jennaj · October 11, 2023, 11:46pm

Thanks @ding66 for posting those details.

I agree – those look like a match!

The only other thing I can think of is that maybe the underscore is causing some “match up” problem. We know that in other use cases a dot character can be problematic (Ensembl gene IDs with a version suffix), and while that usually causes a mismatch that presents in a slightly different way … you could run a quick test with the underscore character removed from all of the inputs to see what happens.

For example … change GL980_000001 to GL980000001.

That can be done in batch on all files with one of the replace tools, or with sed.

Replacing the _ with nothing using sed would use this string: s/_//
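For anyone testing this locally, here is a small Python sketch of what that sed expression does. Without a trailing `g` flag, `s/_//` removes only the first underscore on each line, which is enough when each line holds a single gene ID like the example above. The helper names and sample rows are made up for illustration.

```python
import re

def strip_first_underscore(line):
    """Mimic sed 's/_//': drop only the first underscore on the line."""
    return re.sub("_", "", line, count=1)

def strip_all_underscores(line):
    """Mimic sed 's/_//g': drop every underscore on the line."""
    return line.replace("_", "")

# One gene ID per line: both variants give the same result.
row = "GL980_000001\tGO:0008150"
print(strip_first_underscore(row))

# Hypothetical line with two IDs: only the 'g' variant cleans both.
two_ids = "GL980_000001\tGL980_000002"
print(strip_first_underscore(two_ids))
print(strip_all_underscores(two_ids))
```

If any of your files could contain more than one underscore per line, the `s/_//g` form is probably the safer choice, provided no other column relies on underscores.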

Maybe try that first, and if it doesn’t work, you can share back your history and I’ll take a look at it and try to figure out what might be going wrong that way.

You can post the link back here, or I can start up a direct message where you can share it. Please try to make that history as simple as possible, e.g. put a copy of the inputs into a new history, run the _ manipulations, then the tool, and share that.

ding66 · October 12, 2023, 1:44am

Thanks for your suggestion, Jenn! I tried your method of removing the underscore from the gene_id. However, the problem remains… How do I share the history with you?

jennaj · October 12, 2023, 9:28pm

The link to the FAQ is in here
