Troubleshooting BiG-SCAPE output

cldrozd · January 18, 2025, 12:50am

Hello,

I am trying to use BiG-SCAPE on the galaxy.eu server. When I upload antiSMASH results for entire genomes, in gbk format, the tool runs successfully, but the results do not make sense. Despite many BGCs in the genomes, the tool detects only about 1 BGC and 1 BGC family per genome. Has anyone been able to run this tool on gbk files of entire antiSMASH outputs?

Thank you for any help you can share.

jennaj · January 21, 2025, 10:06pm

Welcome, @cldrozd

I have a small test case here that sounds like it is set up similar to how yours was run. We can review exactly how it is working today.

https://usegalaxy.eu/u/jenj/h/test-bigscape-jan-2025

That will take some time to run. Maybe you can notice where the problem is in this simple case, or notice something technical between your antiSMASH results and this data? We can dig into this more, this is just to get us started.
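One way to look for a technical difference between two sets of GenBank files is to count the feature keys in each record. The sketch below is a minimal, standard-library example and not part of either tool. It assumes the files follow the usual GenBank flat-file layout, and that antiSMASH marks each predicted BGC with a `region` feature (older antiSMASH versions are assumed to use `cluster` instead). The function name `summarize_genbank` is hypothetical.

```python
import re
from collections import Counter

# In a GenBank feature table, a feature key starts at column 6 (after five
# spaces) and is followed by spaces and a location. Qualifier lines are
# indented further, so they do not match this pattern.
FEATURE_LINE = re.compile(r"^ {5}(\S+) +\S")


def summarize_genbank(text):
    """Return one (locus_name, Counter of feature keys) pair per record."""
    records = []
    name, counts, in_features = None, Counter(), False
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            parts = line.split()
            name = parts[1] if len(parts) > 1 else "unknown"
            counts, in_features = Counter(), False
        elif line.startswith("FEATURES"):
            in_features = True
        elif line.startswith("ORIGIN"):
            # Sequence lines can also start with five spaces, so stop
            # counting once the sequence block begins.
            in_features = False
        elif line.startswith("//"):
            if name is not None:
                records.append((name, counts))
            name, in_features = None, False
        elif in_features:
            match = FEATURE_LINE.match(line)
            if match:
                counts[match.group(1)] += 1
    return records
```

Reading each file with `open(path).read()` and passing the text to `summarize_genbank` gives, per record, how many `region` (or `cluster`) features it holds. A whole-genome file with many such features per record is structured differently from a file that holds a single one, which may be worth comparing against what BiG-SCAPE reports.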

If you have a smaller test example to share back, too, that would also help. How to share work is explained in the banner topic of this forum. Smaller yet representative is best.

Update: I decided to pull the original genome sequences for those same genomes and run antiSMASH on them in Galaxy. I’ll then run BiG-SCAPE on those results. This will give a direct comparison between the publicly available annotation and the annotation predicted by running these two tools.

https://usegalaxy.eu/u/jenj/h/test-antismash-then-bigscape

Thanks!

Doolittle_doeslittle · January 31, 2026, 3:00pm

I am having the exact same issue. Many BGCs detected by antiSMASH in each of my genomes, and only 1 per genome and 1 per family predicted by BiG-SCAPE in my results.

jennaj · February 3, 2026, 1:07am

Hi @Doolittle_doeslittle

Did you adjust your anchoring set of Pfam domains? You can test those by specifically asking the tool to consider one or more that resulted from the upstream tool. Adjusting other parameters can help as well. We won’t be able to help much with the parameters here, but the guide can be found at the developer’s site:

Home · medema-group/BiG-SCAPE Wiki · GitHub

The Galaxy implementation is still experimental, and only available at a single server, but it should still work “about” the same! I added some more runs to the original testing history above, using more of the defaults plus a custom anchor file (with a text PF query against the raw files as a comparison for scope). The failures in the history are all due to parameter issues plus sparse-content issues.
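A text query of that kind can be sketched in a few lines of Python. This is a minimal, hedged example and not the exact query used in the history: it assumes the raw GenBank files contain Pfam accessions as plain text (whether they do depends on how antiSMASH was run), and the function names `count_pfam_accessions` and `anchors_found` are hypothetical.

```python
import re
from collections import Counter

# Pfam accessions look like PF00109 or, with a version suffix, PF00109.29.
PFAM_ACCESSION = re.compile(r"\bPF\d{5}(?:\.\d+)?\b")


def count_pfam_accessions(text, strip_version=True):
    """Count every Pfam accession that appears in the text."""
    counts = Counter()
    for accession in PFAM_ACCESSION.findall(text):
        if strip_version:
            accession = accession.split(".")[0]
        counts[accession] += 1
    return counts


def anchors_found(text, anchors):
    """Map each anchor accession (version ignored) to its hit count."""
    counts = count_pfam_accessions(text)
    result = {}
    for anchor in anchors:
        base = anchor.split(".")[0]
        result[base] = counts.get(base, 0)
    return result
```

Passing the text of a raw file and the accessions from a custom anchor list to `anchors_found` shows which anchors appear in that file at all. An anchor with zero hits cannot contribute to the output scope for that sample.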

My running idea list: I think the tool form could make the anchors much more prominent, maybe by moving them to the top of the form, since the output scope will always rely on them. Other ideas: detailed logs should be output by default; the HMM reference could be pre-loaded as a native index; more items could be warnings, or local to a sample within a larger collection grouping; and MIBiG database use should probably be an on-by-default requirement (unless the HMM database input minimum changes).

If you have ideas, you are welcome to drop them into this forum topic or, more directly, you could open an issue ticket at the IUC repository for your “nice-to-have feedback” list. Working with scientists to tune tools is really important for our project!

tools-iuc/tools/bigscape at main · galaxyproject/tools-iuc · GitHub
Here is an example of a previously reported item where I asked the developers for a specific change. In short, you don’t have to explain how to make the adjustment, just what you want to be different, with an example. They will then consider it and engineer it in, if possible. → Enhancement: allow BiG-SCAPE to process inputs without a required .gbk extension in dataset filename · Issue #6015 · galaxyproject/tools-iuc · GitHub

Those issues are not for error troubleshooting. So, if you run into what looks like an error instead, we can vet that here. Seeing the example helps!

Thanks!