I ran BUSCO to evaluate the draft of a bacterial genome and got this error:
“FileNotFoundError: [Errno 2] No such file or directory: ‘/cvmfs/data.galaxyproject.org/byhand/busco/v5/lineages/bacilli_odb10/refseq_db.faa.gz’”
So instead, try one of the other combinations; some can be unexpected. But if the error message says a database “isn’t available”, that means the underlying tool cannot process data that way.
Hmm, if you can show an example of where that worked before, I’d be willing to take a look.
Now, the prior versions with the simplified form all required a lineage selection (much like the first example in dataset 8-9 above), so offhand that may have been it. But now, with the compound form, I think it is all working as expected.
Hello again, in the post you mentioned above, there is something that doesn’t make sense.
Prokaryotic data is used, but the ortholog database used is from plants (liliopsida).
BUSCO works well if you use version 5.8.0+galaxy0.
Therefore, the error does not stem from the absence of orthologous protein databases for each lineage, but rather from the version of the program used. It is clear that the latest version does not work.
I was just setting the default combinations at the top level to show what would be produced – my example isn’t a scientific example, more about the settings. I should have worded this better! I’ll try again.
You have bacterial data, correct? Using Metaeuk is the only requirement; the remainder you can set as needed. The lineage can be “auto”, as I used, but you can also choose a lineage. In short, using Miniprot won’t work with the “auto” choice because it only functions with a lineage selection.
Hi @jennaj, thank you very much for the clarification.
Just to add some context from my side: I am also working with a prokaryotic (bacterial) genome in genome mode, and I have tested both auto-lineage and manual lineage selection (bacteria_odb10). In both cases, BUSCO fails with the same error:
I previously observed the same type of error for archaea_odb10 during auto-lineage as well. This makes me suspect that, on this Galaxy instance (Galaxy AU), the BUSCO lineage reference files themselves may be missing or incomplete, rather than the issue being related only to the MetaEuk/Miniprot or auto/manual lineage combination. I understand your point about the UI combinations and predictors, but in this case the error consistently points to the absence of the refseq_db.faa.gz file for prokaryotic lineages.
Just sharing this in case it helps diagnose a possible server-side database issue affecting prokaryotic BUSCO runs.
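For anyone with shell access to a machine that mounts the same CVMFS repository, here is a minimal sketch of how the missing-file suspicion could be checked. It assumes the directory layout follows the path shown in the error message; the helper name `missing_lineage_files` is hypothetical and not part of BUSCO or Galaxy.

```python
from pathlib import Path

# Assumption: layout inferred from the path in the error message above.
LINEAGE_ROOT = Path("/cvmfs/data.galaxyproject.org/byhand/busco/v5/lineages")


def missing_lineage_files(lineages, root=LINEAGE_ROOT, filename="refseq_db.faa.gz"):
    """Return the expected file paths that do not exist under `root`.

    Hypothetical helper for illustration only.
    """
    missing = []
    for name in lineages:
        expected = Path(root) / name / filename
        if not expected.is_file():
            missing.append(expected)
    return missing


if __name__ == "__main__":
    # Lineage names taken from this thread.
    for path in missing_lineage_files(["bacteria_odb10", "archaea_odb10", "bacilli_odb10"]):
        print(f"missing: {path}")
```

If this prints nothing on a server where the job still fails, the file is present and the cause is more likely the tool settings than the reference data.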
I just checked the data at UseGalaxy.org and everything was working as expected. The UseGalaxy.org.au server should be using this same data through a shared CVMFS repository.
Screenshot of some of the test jobs with parameters tagged (you can explore the shared history for exact details).
Key point: please notice that Select a gene predictor must be set to metaeuk to allow the selection of the bacterial lineage. All prokaryotic lineages will require this same gene predictor.
For eukaryotic, you can use either of the predictors.
This is a good question so I’m glad you asked again, and I understand your point about suspecting that the gene predictor setting is unexpectedly specific!
But this is known and intentional for now – at the UseGalaxy public servers, the computed indexes for prokaryotic genomes are only available for use with the metaeuk option. This leads to Prodigal being used (technically!). Bacterial lineage indexes for miniprot are not available at this time.
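To put the combinations described in this thread in one place, here is a small sketch. It is a hypothetical helper, not part of BUSCO or Galaxy, and it only encodes the rules as stated here for the UseGalaxy public servers: prokaryotic lineages need the metaeuk option, and miniprot needs an explicit lineage rather than “auto”.

```python
def check_busco_settings(domain, predictor, lineage):
    """Return a list of problems with a settings combination.

    An empty list means the combination is consistent with the rules
    described in this thread. Hypothetical helper for illustration only.

    domain:    "prokaryote" or "eukaryote"
    predictor: value chosen for "Select a gene predictor", e.g. "metaeuk"
    lineage:   "auto" or a lineage name such as "bacteria_odb10"
    """
    problems = []
    predictor = predictor.lower()

    # Rule from the thread: miniprot only functions with a lineage selection.
    if predictor == "miniprot" and lineage == "auto":
        problems.append("miniprot needs an explicit lineage, not 'auto'")

    # Rule from the thread: prokaryotic indexes are only available via metaeuk.
    if domain == "prokaryote" and predictor != "metaeuk":
        problems.append("prokaryotic lineages require the metaeuk option")

    return problems


if __name__ == "__main__":
    for problem in check_busco_settings("prokaryote", "miniprot", "auto"):
        print(problem)
```

The function returns messages rather than raising, so every conflicting setting can be reported at once.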
The tool is a bit complicated with all of the options and the comprehensive indexes! Hopefully this explains what is going on, but does this actually help?
Screenshot from the shared history, with the history panel’s datasets displayed, and the rerun form for dataset 14 shown.
Screenshot from the BUSCO 5.8.0+galaxy1 tool form showing the option Select a gene predictor. Tool tip: In the case of a prokaryotic genome, Prodigal is the default gene predictor.
Thanks for your reply @jennaj. It worked when I selected Augustus, but yesterday when I tried selecting metaeuk and miniprot, I got the error again. Thanks for your explanation of the subject. I hope it is helpful for other users in the future. Cheers