error reported for fgsea - gene set enrichment tool

Hi
I am using fgsea for Gene Set Enrichment Analysis on mouse data.
My input is a tabular file with gene symbols in column 1 and LogFC values, sorted in descending order, in column 2.
I tried both a gmt file and an Rds input, downloaded from MSigDB and NCBI, respectively. Both runs fail with the same warning and error:
Warning message:
In Sys.setlocale("LC_MESSAGES", "en_US.UTF-8") :
OS reports request to set locale to "en_US.UTF-8" cannot be honored
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
line 1 did not have 3 elements
Calls: read.table -> scan

This seems a bit strange, because only two elements are required?
My history is this one:

Thanks for your help and advice

Welcome @pacthoen

Thank you for explaining the tool and for sharing your history! Very helpful!

The tool is stating that it found the wrong number of “columns” in one of the files.

This is an R message, and the wording isn't always ideal, but it tells us that R expected three columns in one of the files. Either a file that should have three columns doesn't, or a file that should have fewer columns has an extra one.

Comparing the error with the input files, one input file has two columns of data but a header line with three words (or "elements" or "terms"). So, the number of column header words does not match the number of data columns.

Then looking at the top of the tool form, we can read about what the expected input file format is.

Do you see the problem? The header line includes a space, so the tool interprets the header as describing three columns of data. When it reached the first data line, it found only two columns. That conflict generated an R error that is technically accurate about what went wrong, even though it says little about the overall context.
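To make that mismatch concrete, here is a minimal Python sketch of the same kind of check. It is not the tool's actual code (fgsea runs in R), and the example header and gene rows are hypothetical, since the real file is only in the shared history. It assumes whitespace (spaces or tabs) separates terms, and it reports the first data line whose term count differs from the header's.

```python
def field_counts(lines):
    """Count whitespace-separated terms on each non-empty line."""
    return [len(line.split()) for line in lines if line.strip()]


def first_mismatch(lines):
    """Return (data_line_number, found, expected) for the first data line
    whose term count differs from the header's, or None if all agree."""
    counts = field_counts(lines)
    if not counts:
        return None
    expected = counts[0]  # the header sets the expected number of columns
    for number, found in enumerate(counts[1:], start=1):
        if found != expected:
            return (number, found, expected)
    return None


# Hypothetical content: the header term "Gene symbol" contains a space.
example = ["Gene symbol\tlogFC", "Trp53\t2.31", "Myc\t1.87"]
print(first_mismatch(example))
```

A result that names data line 1 with fewer terms than the header is the same situation that R reports as "line 1 did not have 3 elements".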

This is pretty common with tools: the tool isn't getting the data it expects, and it can only report the problem from a purely logical perspective. Any time you see a message about the number of lines or elements in an error log, it indicates a format problem you'll need to double check, because your data differs somehow from what the tool expected.

It is common for tools to interpret all whitespace (spaces or tabs) as the same: the whitespace is separating terms (aka elements or words). Removing extra spaces so that each term is all oneWord is a good idea to avoid errors. You’ll need to do that with your file, then try again.

The only time you should include spaces in a term is when that field is declared as a "descriptive text" term. Sometimes these are quoted, too. The tool form will usually explain how to format data in the Help section (or link to a tool guide), since it is so important to know about. If you ever get stuck, you can ask here and we can help to check.

We have some data manipulation guides or you can just search the tool panel. A “find and replace” function would be a good choice for your file.
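If you prefer to fix the file outside of Galaxy, the same find and replace can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are made up for this example, the header shown is hypothetical, and it assumes the file is tab-separated with the header on the first line. It replaces spaces inside each header term with an underscore so that every term is one word.

```python
def fix_header(header_line, replacement="_"):
    """Join the words inside each tab-separated header term into one word."""
    terms = header_line.rstrip("\n").split("\t")
    return "\t".join(replacement.join(term.split()) for term in terms)


def fix_table(lines):
    """Return the lines with a cleaned header; data lines are left unchanged."""
    if not lines:
        return []
    return [fix_header(lines[0])] + [line.rstrip("\n") for line in lines[1:]]


# Hypothetical header with a space in the first term.
print(fix_table(["Gene symbol\tlogFC", "Trp53\t2.31"]))
```

After this change the header and the data lines both split into two terms, which is what the tool expects for a two-column ranked gene file.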

Hope this helps! If not, let us know and we can look into your new error/data next. Thanks! :slight_smile: