Fool seeks aid for LEfSe analysis in Galaxy

Hi all,

I’m having some trouble analysing my sequencing results with LEfSe. I’ve been trying to get my head around all the ways I’m getting it wrong, but honestly I keep hitting roadblocks, whether it’s Galaxy, Conda, Python or R.

I think (hope) it’s down to my actual data structure.

My primary data file looks like this:

while my metadata file is structured like so:
image

Is there anything glaringly wrong with my data structure here that I need to change?

Thank you!

Hi @dan_ja

Mostly a guess → the labels don’t match between the two files. The first file has “High_Protein” where the second has “High Protein”, and likewise “KD10__” versus “KD10”, “Sample” versus “NAME”, and “Condition” versus “Protein_group”.
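If it helps, here is a minimal Python sketch of how you could list the labels that appear in one file but not the other. The function name `unmatched_labels` is made up for illustration, and the two example lists only contain the labels quoted above, not your full files, so swap in the real columns from your data.

```python
def unmatched_labels(data_labels, metadata_labels):
    """Return the labels found in only one of the two collections."""
    data_set = set(data_labels)
    meta_set = set(metadata_labels)
    # Set difference keeps whatever has no exact match on the other side.
    only_in_data = sorted(data_set - meta_set)
    only_in_metadata = sorted(meta_set - data_set)
    return only_in_data, only_in_metadata


# Illustrative values only: the labels quoted in this reply.
only_in_data, only_in_metadata = unmatched_labels(
    ["High_Protein", "KD10__"],
    ["High Protein", "KD10"],
)
print("Only in data file:", only_in_data)
print("Only in metadata file:", only_in_metadata)
```

Anything printed in either list is a label a merging tool would fail to pair up. Two empty lists mean the labels agree exactly.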

Tools that merge data between files need the labels to match exactly. R tools also don’t like values that include spaces or odd characters, or that start with a number. So – make every label OneWord, not starting with a number, and use underscores only (optional) for compound names, as in One_Word.
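As a rough sketch of that naming rule, something like the following Python helper could be applied to every label in both files before uploading. The name `clean_label` is hypothetical. The "X" prefix for labels starting with a digit is my assumption, borrowed from what R's `make.names` does; any letter prefix would work as long as it is the same in both files.

```python
import re


def clean_label(label: str) -> str:
    """Turn a free-text label into a single word that R tools accept."""
    # Replace each run of characters other than letters, digits, or
    # underscores (spaces, dashes, punctuation) with one underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", label.strip())
    # Collapse repeated underscores and trim them from both ends.
    cleaned = re.sub(r"_+", "_", cleaned).strip("_")
    # Assumption: prefix labels that start with a digit with "X".
    if cleaned and cleaned[0].isdigit():
        cleaned = "X" + cleaned
    return cleaned


# Illustrative values only: labels quoted earlier in this reply.
for raw in ["High Protein", "High_Protein", "KD10__", "KD10"]:
    print(repr(raw), "->", repr(clean_label(raw)))
```

Running the same function over both files is the point: “High Protein” and “High_Protein” both come out as one spelling, as do “KD10__” and “KD10”. Column headers that differ by name rather than by formatting (“Sample” versus “NAME”) still need renaming by hand.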

In Galaxy, the tools have an extra component that can smooth out that naming, but it can’t be perfect, especially for values shared between different files, so simplify the naming yourself if trouble comes up.

I’d start with addressing that first. :slight_smile: