Hi, I’m trying to use PanTA, Panaroo, and Roary with the dataset provided by the PanTA paper (Efficient inference of large prokaryotic pangenomes with PanTA). I took the three-species dataset (a tar.gz file) and converted it to a .zip file to upload it to Galaxy.eu. There, I unzipped it using the ‘Unzip a file’ tool and then changed the datatype of the resulting collection to GFF3.
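For the tar.gz-to-zip conversion step, a small script can repack the archive without extracting it to disk first. This is just a minimal sketch of one way to do it; the filenames here are hypothetical placeholders, not the actual dataset names:

```python
import tarfile
import zipfile

def targz_to_zip(src, dst):
    """Repack a .tar.gz archive as a .zip (e.g. for upload to Galaxy)."""
    with tarfile.open(src, "r:gz") as tar, \
         zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zf:
        for member in tar.getmembers():
            f = tar.extractfile(member)
            if f is not None:  # skip directories and links
                zf.writestr(member.name, f.read())

# Hypothetical filenames; substitute the actual archive name.
# targz_to_zip("three_species.tar.gz", "three_species.zip")
```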
I am able to pass this collection (with 600 GFF3 files) to each tool, but PanTA and Panaroo don’t work as expected. Panaroo outputs an empty list with 0 datasets, and PanTA says that there must be at least 2 samples, even though the input has about 600 GFF3 files.
Here is the link to the history I am using to test this: Galaxy
I would appreciate it if anyone could help me with this issue, as I am new to Galaxy.
Thanks in advance,
Fernando Martin Garcia
admin edit: fail - “a list with 0 datasets” & “Exception: There must be at least 2 samples” - Inputs not recognized?
I can’t see how the jobs were set up (they weren’t fully shared), but I can see the input collection (both copies). I’m wondering if the sample names (collection element identifiers) were truncated by the tools due to the format.
I’ve reorganized the samples with group tags (for the batch groups) and simplified the element identifiers (sample names), plus pushed the full original name into a general-purpose tag (an optional “name” tag – you could change this to something else).
I’ve done that for the full batch, then created a downsampled set (5 samples) for you to test with.
Would you like to run some tests to see if the reorganization was enough to resolve the immediate sample-interpretation error? I didn’t notice anything else special about the GFF3 files – they look OK to me on a first pass.
Then, if the job(s) fail again and you don’t want to share the actual datasets in the history, please capture a screenshot of the job Details view so I can see the tool input/parameter configuration and the full job logs, to replicate the job and examine the technical details. We can pull in the EU administrators if this turns out to be a server issue, but I don’t think that is needed yet.
How to use this history: you can just review it, or import it to see what I’ve done with the Apply Rules tool and a few other Collection Operations tools. Remember that you can copy datasets between histories (copying the top-level collection automatically pulls in the elements/files), and use the rerun icon to bring up the original tool form, change the input, and run it yourself on any other subsets you want to try, if the downsampling/tagging approach is interesting to you (and it works!). The icons for collections are at the top after clicking into the collection. Then purge anything not needed once you’re done. Tutorials for these tools are linked at the bottom of the tool forms, but please ask if anything is not clear and I can point you to specific resources.
Great! Let’s start there.
Update: I found one of the older jobs in the deleted set. I’ve started up a test run with the downsample reorganized collection to see what happens! All is in the same history above.
Yes, the issue was with the collection identifiers. The last test in the shared history is still waiting to run but I’m expecting it to be successful, or to at least present with a different error (content, not format).
The exact process will be different for everyone, but the instructions above (extracting and directly applying new identifiers for those who prefer text manipulation utilities) and the examples in the shared history above (using another method) can get you started.
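As one example of the “text manipulation utilities” route, the sketch below renames a directory of GFF3 files to short sequential identifiers before upload and returns the original-to-simplified mapping so the full names aren’t lost. The naming scheme (`sample_001`, …) and the function name are my own assumptions, not something prescribed by the tools:

```python
import os

def simplify_gff3_names(directory):
    """Rename .gff/.gff3 files to short sequential identifiers.

    Returns a dict mapping each original filename to its new name,
    which you can keep (e.g. as a tag or a TSV) to recover the
    original sample names later.
    """
    mapping = {}
    files = sorted(f for f in os.listdir(directory)
                   if f.endswith((".gff3", ".gff")))
    for i, name in enumerate(files, 1):
        ext = ".gff3" if name.endswith(".gff3") else ".gff"
        new = f"sample_{i:03d}{ext}"  # hypothetical scheme
        os.rename(os.path.join(directory, name),
                  os.path.join(directory, new))
        mapping[name] = new
    return mapping
```

Sorting first keeps the numbering deterministic, so reruns on the same directory contents produce the same mapping.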
Collection 4331 with the rerun icon to bring up the Apply Rules tool
Thank you very much for the quick and precise reply!
It works now: both Panaroo and PanTA seem to be running fine with the new identifiers. I tend to do this step a bit differently, though, and wanted to ask whether my approach to changing the identifiers is also okay?