Can I extract specific sequences from a FASTA proteome in Galaxy, given their transcript_ids?

Nicolas_Romero_Villa · October 2, 2023, 2:05am

I have a 42K-sequence proteome from a non-model organism, and I need to extract the 1,000 sequences that I obtained from a differential expression analysis so that I can use eggNOG-mapper for the GO enrichment analysis. Can I do that in Galaxy?

jennaj · October 2, 2023, 5:58pm

Hi @Nicolas_Romero_Villa

Do you mean that you want to subset the 42k protein sequences before running eggNOG?

Are those protein sequences annotated at all already? Associated with a Gene identifier?

Did the DE analysis involve using any known annotation? Meaning, transcript to gene associations are known?

If you have both, then you could use the gene identifiers to subset. But that could be limiting: you won’t find any potential novel genes, since they wouldn’t be included in the subset. This would be true wherever you run these tools. Data cannot be subset until you learn how they are associated (associated as orthologs in this use case).
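To make the ID-based subsetting concrete, here is a minimal Python sketch of the idea, run outside of any particular Galaxy tool. It assumes the identifier is the first whitespace-delimited token of each FASTA header and that it matches the IDs in your DE list exactly. The function name `subset_fasta` and the `tx1`/`tx2`/`tx3` records are hypothetical and only there for illustration.

```python
def subset_fasta(lines, wanted_ids):
    """Yield the FASTA lines of every record whose ID is in wanted_ids."""
    keep = False
    for line in lines:
        if line.startswith(">"):
            # Assumption: the ID is the first whitespace-delimited
            # token of the header line, e.g. ">tx1 some description".
            parts = line[1:].split()
            keep = bool(parts) and parts[0] in wanted_ids
        if keep:
            yield line


# Hypothetical toy proteome with three records.
example = [
    ">tx1 hypothetical protein\n",
    "MKT\n",
    ">tx2\n",
    "MAA\n",
    "GGL\n",
    ">tx3\n",
    "MPP\n",
]

# Keep only the records named in the (hypothetical) DE list.
subset = list(subset_fasta(example, {"tx2", "tx3"}))
```

With a real file you would pass an open file handle instead of the `example` list and write the yielded lines to a new FASTA. If the headers in the proteome carry version suffixes or prefixes that the DE table does not, the IDs would need to be normalized first.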

You’ll need to run eggNOG on the entire set that you expect to be connected to the genes of interest, discover the ortholog groups, then filter after for the actual associations (per eggNOG) with your genes of interest (DE result).
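The "annotate everything, then filter" order could look roughly like the Python sketch below. It is only an illustration under assumptions: that the annotation table is tab-separated, that comment lines start with `##`, and that the header row has an ID column named `#query` and a GO column named `GOs` with comma-separated terms and `-` for "no annotation". Please check those names against your own output file. The function name `filter_annotations` and the toy rows are hypothetical.

```python
import csv


def filter_annotations(lines, de_ids, id_col="#query", go_col="GOs"):
    """Return {query_id: [GO terms]} for the queries listed in de_ids."""
    # Assumption: lines starting with '##' are comments; the first
    # remaining line is the tab-separated header row.
    rows = (ln for ln in lines if not ln.startswith("##"))
    reader = csv.DictReader(rows, delimiter="\t")
    result = {}
    for row in reader:
        qid = row[id_col]
        if qid in de_ids:
            gos = row[go_col]
            # Assumption: '-' (or an empty field) means no GO terms.
            result[qid] = [] if gos in ("", "-") else gos.split(",")
    return result


# Hypothetical annotation table covering the whole proteome.
example = [
    "## comment line\n",
    "#query\tseed_ortholog\tGOs\n",
    "tx1\t123.A\tGO:0008150,GO:0003674\n",
    "tx2\t456.B\t-\n",
    "tx3\t789.C\tGO:0005575\n",
]

# Filter afterwards for the genes of interest from the DE result.
hits = filter_annotations(example, {"tx1", "tx2"})
```

The unfiltered table stays useful as well: the annotations for the full set are what you would typically use as the background when testing the DE subset for enrichment.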

Please explain more … am I misunderstanding the goal?
