I wanted to download and use the cgMLST scheme for Listeria monocytogenes. I selected the correct organism, but received the scheme for Campylobacter jejuni instead. After checking the remaining schemes, I noticed that the first three or so appear to be correct, while the rest are either mixed up or replaced by organisms not included in the list at all. Listeria monocytogenes is completely missing.
As a workaround, I am downloading the scheme manually for now, but I wanted to flag this issue here in case it is an unknown problem.
This seems pretty significant! Do you think this affects only the indexes hosted in Galaxy, or is it a problem upstream, at the tool development step?
I only see one recent change that might lead to a mix-up: overly long sequence names, which could get truncated. Please see the details here and let us know what you think!
You are also welcome to share your history for more specific feedback. If there is some issue with the tool wrapper, we can investigate and get it reported for a fix!
Oops, this does indeed look quite significant. The species IDs assumed by the Galaxy wrapper are all different from the ones here (except for the first three, as you've already found, @Anu): Chewie Nomenclature Server
I'll clarify this with the developers of the wrapper. Thanks a lot for reporting!
I'm fairly new to Galaxy and bioinformatics, but I suspect this is the root of the problem (from the Chewie Nomenclature Server):
So I guess the IDs are now mixed up, and the tool is probably grabbing the wrong schemes because of it.
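To illustrate the suspected failure mode, here is a minimal sketch of how a hardcoded species-to-ID table could be checked against the IDs a server currently reports. The species names, the ID values, and the `find_id_mismatches` helper are all hypothetical placeholders; they are not the real Chewie-NS IDs and not the wrapper's actual code.

```python
def find_id_mismatches(wrapper_ids, server_ids):
    """Return species whose wrapper ID differs from the server's ID.

    Maps species name -> (ID assumed by wrapper, ID on server or None if absent).
    """
    mismatches = {}
    for species, wrapper_id in wrapper_ids.items():
        server_id = server_ids.get(species)  # None if the server does not list it
        if server_id != wrapper_id:
            mismatches[species] = (wrapper_id, server_id)
    return mismatches


# Illustrative values only (assumption): the first three agree, the fourth has drifted.
wrapper_ids = {"Species A": 1, "Species B": 2, "Species C": 3, "Species D": 4}
server_ids = {"Species A": 1, "Species B": 2, "Species C": 3, "Species D": 7}

for species, (assumed, actual) in find_id_mismatches(wrapper_ids, server_ids).items():
    print(f"{species}: wrapper assumes ID {assumed}, server reports {actual}")
```

If a tool requests a scheme by a stale numeric ID, it would silently receive whatever organism now holds that ID, which would match the behaviour described above.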
Thank you, Wolfgang, for reporting it on GitHub!