LEfSe uploaded file looks different to 2018 – problem?

janinewaege · November 18, 2020, 4:37pm

Hello everyone, I already used LEfSe in 2018 and now have a new dataset to analyse. When I upload my .tsv file, it looks different when I view the data with the “view data” function (see pictures attached). In 2018 the table had a grey bar with numbers at the top and it looked like a proper table. Now my data looks like a text file. I was wondering if something is wrong with my file, although LEfSe doesn’t complain. My further analysis says there are no significant discriminative features in my 133 samples. I am not sure if I can trust the results now. Any help and/or advice is appreciated!

jennaj · November 18, 2020, 6:01pm

Hi @janinewaege

The first screenshot is what would be expected for a dataset with the datatype “tabular” or “tsv” assigned.

The second screenshot is what would be expected for a dataset with the datatype “txt” assigned. It is the default view for any plain text file (“tabular”, “txt”, plus others).

A very large dataset of many datatypes will revert back to the view shown in the second screenshot (“large” can differ by Galaxy server). That seems to be what is happening – the files are different sizes but both have the datatype “tabular” assigned? Meaning, dataset 47 has fewer lines than dataset 56 (?). That would explain the difference in the “View data” display. This really is “display” only and doesn’t change the actual contents of the file/dataset. If you look at the expanded dataset 56 “peek” view in the history panel, you can see that it was small enough to display as “tabular with the column numbers” since only the first 5 lines are used for that function (for practical/design reasons).

Note: A compressed dataset (of various types) might display as plain text (second screenshot) as well. This is not your situation but seemed worth clarifying for you (other display use-cases) or for anyone else reading this Q&A.

How data displays differently (by datatype or size) is part of the larger application settings. That is unrelated to how tools function/execute and wouldn’t lead to any downstream differences in analysis results – the dataset’s content/data itself isn’t modified, and that is what tools use as inputs.
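If you want to confirm that independently of the display, you could check the file locally before uploading. Below is a minimal sketch in Python, assuming only that the file is plain tab-separated text; the function names and the tiny inline sample are made up for illustration and are not part of Galaxy or LEfSe. It counts the non-empty rows and tallies how many fields each row has, so a file that parses as a consistent table gives a single field count.

```python
import csv
import io
from collections import Counter


def summarize_tabular(handle, delimiter="\t"):
    """Return (number of non-empty rows, Counter of fields-per-row)."""
    # QUOTE_NONE: treat quote characters as ordinary text, not as field wrappers
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    field_counts = Counter(len(row) for row in reader if row)
    n_rows = sum(field_counts.values())
    return n_rows, field_counts


def looks_consistent(field_counts):
    """True if every non-empty row has the same number of fields, and more than one."""
    return len(field_counts) == 1 and next(iter(field_counts)) > 1


# Made-up three-line sample standing in for a real upload (hypothetical content)
sample = "class\tA\tB\nsubject\ts1\ts2\nBacteria\t0.5\t0.5\n"
n_rows, counts = summarize_tabular(io.StringIO(sample))
print(n_rows, dict(counts), looks_consistent(counts))
```

For a real file, pass an open handle instead of the inline sample, for example `open("my_data.tsv", newline="")`, where the filename is a placeholder. A single field count greater than one suggests the file is a well-formed table regardless of how the “View data” page renders it; it says nothing about whether the values are suitable for the analysis itself.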

You are working at this public server, correct? Galaxy | Hutlab Galaxy

The server is set up a bit differently than other public servers and hosts custom versions of tools, in particular the LEfSe tool suite. Their support/admin team does not follow this forum, but there are other ways to contact them about odd or non-reproducible results. I don’t think that is what your issue is (unless those results are not reproducible; maybe run the analysis again and compare?). But if you have tool concerns or odd results with a rerun comparison, you could contact the Huttenhower support/admin team for more help, especially if you think there is a technical problem unrelated to the display differences. Please see this prior Q&A for how:

Thanks!

janinewaege · November 20, 2020, 9:48am

Thank you very much for the detailed answer. Both datasets got assigned as “tabular” and yes, dataset 42 had only 20 samples and dataset 56 had 133 samples. I was using the public server (https://huttenhower.sph.harvard.edu/galaxy/). Repeating the analysis showed the same result.