EGSEA help - Error in plotHeatMapsLogFC

Hello guys,

I am running EGSEA, and it was working fine until I started to run into errors.

My counts matrix is in tabular format:

Geneid FFSSCLCR3 FFSSCLCR2 FFSSCLCR1 iPSC1 iPSC2 iPSC3
2018 2867.300115 4384.063805 3697.321171 0 0 0
196047 2139.832269 3254.124667 2938.558938 0 0 3.544768899
27063 6174.085968 2953.349501 6630.032231 0 7.378716416 5.317153348

I also have a factor file in tabular format:

Sample Cell Type
FFSSCLCR3 FFSSCLC
FFSSCLCR2 FFSSCLC
FFSSCLCR1 FFSSCLC
iPSC1 iPSC
iPSC2 iPSC
iPSC3 iPSC

and my annotation file is tabular as well:

ENTREZID SYMBOL
100287102 DDX11L1
653635 WASH7P
102466751 MIR6859-1
100302278 MIR1302-2
645520 FAM138A
79501 OR4F5
729737 LOC729737
102725121 DDX11L17

But when I try to run EGSEA, it reports:
groupGOTerms: GOBPTerm, GOMFTerm, GOCCTerm environments built.
EGSEA analysis has started
Log fold changes are estimated using limma package …
limma DE analysis is carried out …
EGSEA is running on the provided data and h collection
…globaltestcamera.ora*
EGSEA is running on the provided data and c5 collection
…cameraglobaltestora*
EGSEA is running on the provided data and gsdbgo collection
…globaltestcamera.ora*
EGSEA is running on the provided data and kegg collection
…globaltestcamera.ora*
EGSEA analysis took 12.34 seconds.
EGSEA analysis has completed
EGSEA HTML report is being generated …
Report pages and figures are being generated for the h collection …
Heat maps are being generated for top-ranked gene sets
based on logFC …
Error in plotHeatMapsLogFC(gene.sets = gsets.top, fc = logFC, limma.tops = limma.tops, :
All featureIDs in the gs.annot list should map to
a valid gene symbol
Calls: egsea.cnt → egsea

I tried to troubleshoot, but gave up after three days of trying.
Can anyone help me figure out what I am doing wrong here?

Thank you in advance

Hello guys,

Could it be because I am using hg38?
Should I try with hg19 and run EGSEA again?

I tried removing all "NA", "LOC###", and "XXX-AS" genes from my list. It still doesn't run, and I am running out of ideas for how to make it run.

Thank you,

I am experiencing the same problem. Did you manage to sort it out?

@Filip_Filipsky @incho

The problem is reported in the error message here:

Tool form instructions:

Symbols Mapping file

A file containing the Gene Symbol for each Entrez Gene ID. The first column must be the Entrez Gene IDs and the second column must be the Gene Symbols. It is used for the heatmap visualization. The number of rows should match that of the Counts Matrix.

Checking that both datasets contain the same number of rows is a basic check, but you could also compare the IDs and make sure they are a match. You might also need to adjust the header lines in the files – these deviate from the example data.
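If it helps, here is a rough sketch of how that comparison could be done outside of Galaxy with plain Python. It is only an illustration, not part of EGSEA or the Galaxy wrapper: the function names `read_ids` and `compare_ids` are made up for this example, and it assumes both files are tab-separated with a single header line and the Entrez Gene ID in the first column. The inline rows are just the excerpts posted above, not the full files, so the mismatch it reports says nothing definite about the real data.

```python
import csv
import io


def read_ids(handle, delimiter="\t"):
    """Return the first-column values of a tabular file, skipping the header row."""
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader, None)  # assumption: exactly one header line
    return [row[0].strip() for row in reader if row and row[0].strip()]


def compare_ids(count_ids, symbol_ids):
    """Summarize how the counts matrix IDs and the symbols mapping IDs differ."""
    count_set, symbol_set = set(count_ids), set(symbol_ids)
    return {
        "counts_rows": len(count_ids),
        "symbols_rows": len(symbol_ids),
        "missing_from_symbols": sorted(count_set - symbol_set),
        "missing_from_counts": sorted(symbol_set - count_set),
        "same_order": count_ids == symbol_ids,
    }


# Stand-ins for the two files, built from the excerpts in this thread.
# With real data, replace io.StringIO(...) with open("counts.tabular"), etc.
counts_text = (
    "Geneid\tFFSSCLCR3\tiPSC1\n"
    "2018\t2867.300115\t0\n"
    "196047\t2139.832269\t0\n"
    "27063\t6174.085968\t0\n"
)
symbols_text = (
    "ENTREZID\tSYMBOL\n"
    "100287102\tDDX11L1\n"
    "653635\tWASH7P\n"
)

report = compare_ids(
    read_ids(io.StringIO(counts_text)),
    read_ids(io.StringIO(symbols_text)),
)
for key, value in report.items():
    print(key, value)
```

If `missing_from_symbols` is not empty, some genes in the counts matrix have no row in the mapping file, which would be consistent with the "should map to a valid gene symbol" message. Different row counts or `same_order` being false would also be worth fixing before rerunning.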

Some tools have built-in expectations for how the data are labeled in headers that the Galaxy wrapper around the tool cannot auto-adjust. This can lead to spurious error messages. So, I’d recommend following the same header labeling as in the examples just to eliminate that from being a problem, too. The sample names can differ of course, but columns of data representing gene/symbol information could have standardized labels in the first row.

Using the most current version of any tool is also usually important, to capture bug fixes and wrapper enhancements, and to make further troubleshooting easier. If you are not working at a usegalaxy.* server, you might want to try the tool at one of those as a comparison. UseGalaxy.org and UseGalaxy.eu are the best choices – UseGalaxy.org.au usually follows the same server (but not cluster) configuration as UseGalaxy.eu so testing/comparing between those is not normally needed.

If the tool still fails after those adjustments are made/confirmed, please confirm:

  1. The URL of the usegalaxy.* server where you tested.
  2. The complete tool name and version – find this info at the top of the tool form.

Let’s start there. We may ask for a history share link to help more, and that can be posted back if you are not concerned about keeping the data private, or let us know if you do want to keep it private and a moderator will start a private message thread for sharing. If there is some bug with the tool, we can help to sort that out versus a usage issue.

Jennaj,

Thank you for the reply. I ended up ditching the dataset. However, I will follow your recommendation and try to reanalyze the data using the same headers as the example.
I first thought that I was getting the errors because of NAs and LOC### and so on.
Hopefully, it is just the header.

I will update the thread for @Filip_Filipsky too.

Thanks,
