Panaroo error - missing dependencies?

Hello,

I was trying to run Panaroo on a collection of GFF files, but I kept getting the same error: the tool could not find the input directory. This looks like a disconnect between the GFF collection and the tool rather than an issue with the input data itself. Are there any fixes or workarounds for this? Below is the run info screenshot:

Hi @Nanobes ,

Please report the error to the server support using the following procedure: click on any output from the failed job, click the Error icon (the one that looks like a ladybird beetle), and in the middle window click the Report button. Do not delete the output of the failed job when you report it to the server support. Error reports give us access to the job's settings and log files.

Kind regards,

Igor

Hi Igor,

Thank you for your help. I’ve tried to find the error report icon, but it does not show up when I select the job output:
Am I looking in the wrong area?

Hi @Nanobes ,

I am sorry for the unclear instructions. Collections require extra click(s). Click the Info ( i ) icon. In the middle window, scroll down to the Tool Outputs section. If you see any output file that is not a collection, click on it and you’ll see the Error icon. If you see only collections, rerun the job and add the log file to the outputs.

Panaroo has a very high failure rate on Galaxy Australia, but it passes tests, and I see completed jobs from users.

Kind regards,

Igor

Hi @Nanobes,

If you don’t have a Panaroo log file in your history, re-run the job and change Output log file to Yes. Then click on the log file from the failed job and click the Error icon.

Kind regards,

Igor

Thanks Igor,
There was no output file so I’m glad you mentioned the re-run with log option. I’ve done that and submitted the error report now.

Hi @Nanobes,

It seems the error was caused by the dataset names in the input collection. Click the Info ( i ) icon on the log file from the failed Panaroo job. In the middle window, scroll down to Tool Standard Output log file and expand the black box. The last line is:
FileNotFoundError: [Errno 2] No such file or directory: 'input_directory/*.gff'

Note the .gff at the end. Datasets in the input collection have names like Prokka on data XXX_ gff. I suspect that [space character]gff is not recognised as a valid suffix. I renamed three files from the input collection to ProkkaXXX.gff (where XXX is the number from the original name), created a new collection, and the Panaroo job completed.
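To illustrate the suspicion, here is a minimal Python sketch. It assumes the tool looks for inputs with the shell-style pattern `*.gff` seen in the error message; the two element names are hypothetical examples modelled on the ones described above, and this is not Panaroo's or Galaxy's actual code.

```python
from fnmatch import fnmatch

# Pattern taken from the error message: input_directory/*.gff
PATTERN = "*.gff"

# Hypothetical element names: one in the original style, one renamed
names = ["Prokka on data 12_ gff", "Prokka12.gff"]

# Check each name against the pattern
matches = {name: fnmatch(name, PATTERN) for name in names}

for name, ok in matches.items():
    status = "matches" if ok else "does not match"
    print(f"{name!r} {status} {PATTERN}")
```

Only the renamed file ends in a literal `.gff`, so only that one would be picked up by such a pattern.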

You can probably change the names of the datasets in the collection with the following steps (tools in the Collection Operations section):

1 Extract element identifiers

2 Change the identifiers to something like ProkkaXXX.gff using text replacement or manipulation tools

3 Paste the original and modified names side by side into one file (Paste files side by side)

4 Use Relabel Identifiers with the two-column file to change the names in the collection.
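For anyone who prefers to prepare the two-column file outside Galaxy, steps 2 and 3 can be sketched in Python as below. This is only an illustration: the function names `new_identifier` and `relabel_table` and the sample identifiers are hypothetical, and it assumes every original name contains "data" followed by the dataset number.

```python
import re

def new_identifier(old: str) -> str:
    """Turn e.g. 'Prokka on data 12_ gff' into 'Prokka12.gff'."""
    match = re.search(r"data\s+(\d+)", old)
    if match is None:
        raise ValueError(f"no dataset number found in {old!r}")
    return f"Prokka{match.group(1)}.gff"

def relabel_table(identifiers):
    """Build tab-separated lines: original name, then new name."""
    return "\n".join(f"{old}\t{new_identifier(old)}" for old in identifiers)

# Hypothetical identifiers, as they might come out of step 1
old_ids = ["Prokka on data 12_ gff", "Prokka on data 15_ gff"]

table = relabel_table(old_ids)
print(table)
```

The printed text could be uploaded as a tabular dataset and used as the two-column file in step 4.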

Hope that helps.

Kind regards,

Igor


Hi Igor,

Perfect! It now works. Thank you for your assistance!
