Job failure due to exceeding resources at public Galaxy servers -- Solution: Modify the job or consider a custom Galaxy server

Hi,

I am trying to run blastn with a ~21 MB .ffn file of genes from a bacterial genome as the query and a ~6 GB .fasta file consisting of genes from multiple different bacterial genomes as the target. However, the job failed with this error:

'num_threads' is currently ignored when 'subject' is specified.
/jetstream/scratch0/main/jobs/37541641/command.sh: line 120: 2997 Killed blastn -query '/jetstream/scratch0/main/jobs/37541641/inputs/dataset_60967048.dat' -subject '/jetstream/scratch0/main/jobs/37541641/inputs/dataset_60968206.dat' -task 'blastn' -evalue '0.001' -out '/jetstream/scratch0/main/jobs/37541641/outputs/dataset_60990860.dat' -outfmt '6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore qlen slen' -num_threads "${GALAXY_SLOTS:-8}" -strand both -dust yes -parse_deflines

Has anyone had experience solving this error and submitting blastn jobs successfully?

Thanks!

Hi @earthworm

Thanks for sending in the bug report. I replied to it with more details specific to your situation, so below is just a summary for others reading along.

  1. What does this warning mean? 'num_threads' is currently ignored when 'subject' is specified.
  • It means the Galaxy server you are working on is configured to set the number of threads a job uses based on a built-in target database (one already run through makeblastdb).
  • If a custom target fasta is supplied instead (via 'subject'), that pre-set option is ignored. Why? For practical reasons determined by the authors of BLAST+. The warning is reported on stderr by the underlying tool itself.
  • A custom target fasta can be run through makeblastdb in Galaxy (the tool is wrapped and available), but the indexing may not work for every fasta format (anywhere), and at public Galaxy servers it may fail for fasta datasets with a very large number of individual sequences (the indexing job also exceeds resources). All BLAST+ tools are picky about the format of both queries and targets. See the first sketch after this list for an indexed-database workflow.

  2. When a job is too large to run at a public server as submitted (for resource reasons), modifying the inputs or refining parameters can be an effective solution. The goal is to reduce the memory the job needs, shorten its runtime, or limit the output content so that the work does not exceed the available computational resources. See the second sketch after this list for examples.

  3. Galaxy itself can handle even the largest of projects. However, very large, compute-intensive, and/or time-sensitive analysis projects are not appropriate for public Galaxy servers. Consider setting up your own Galaxy server (personal, lab, institutional) and allocating sufficient resources. For any tool, the resources required to run a job in Galaxy are the same as for running the same tool on the command line, and the underlying tool's documentation is the best place to learn about tuning computing resources. Admins of a Galaxy server can also specify how attached computing resources (local or cloud) are used.
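
First sketch, for illustration only: file names like targets.fasta and genes.ffn are placeholders, and the thread count is arbitrary. A custom target fasta could be indexed once with makeblastdb and then searched via -db rather than -subject, which also lets -num_threads take effect:

    # Index the custom target fasta once (nucleotide database).
    makeblastdb -in targets.fasta -dbtype nucl -out targets_db

    # Search against the indexed database; with -db instead of -subject,
    # -num_threads is honored.
    blastn -query genes.ffn -db targets_db -task blastn -evalue 0.001 \
        -outfmt 6 -num_threads 4 -out results.tsv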
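
Second sketch, one way of shrinking a job (again with placeholder file names, and assuming the indexed database built above): split the query into chunks and cap the output volume. The e-value and hit-cap values here are arbitrary examples, not recommendations for your data.

    # Split the query fasta into 4 roughly equal chunks, round-robin by record.
    awk '/^>/{n++} {print > ("chunk_" ((n-1) % 4) ".ffn")}' genes.ffn

    # Run each chunk separately; a stricter e-value and a cap on reported
    # hits reduce both runtime and output size.
    for c in chunk_*.ffn; do
        blastn -query "$c" -db targets_db -evalue 1e-10 \
            -max_target_seqs 5 -outfmt 6 -out "${c%.ffn}.tsv"
    done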

More about setting up a Galaxy server for urgent or large work:

This forum can be searched with keywords like “cloud” or “gvl” to find prior Q&A about working with custom Galaxy servers. I also added some tags to your post that point to topics like that.

Hope that helps!
