Troubleshooting an error (exit code 1) when running BWA-MEM2 on WGS data

Hello,

I attempted to run the BWA-MEM2 tool on two WGS datasets (the two sets of reads from a paired-end run) and a reference genome from NCBI. The job failed with the following messages:

Execution resulted in the following messages:
Fatal error: Exit code 1 ()

Tool generated the following standard error:
Looking to launch executable "/usr/local/bin/bwa-mem2.avx2", simd = .avx2
Launching executable "/usr/local/bin/bwa-mem2.avx2"
[bwa_index] Pack FASTA... 22.12 sec
** Entering FMI_search*
init ticks = 173525096758
ref seq len = 5422419662
binary seq ticks = 119852397722
Allocation of 40.40 GB for suffix_array failed.
Current Allocation = 45.45 GB

If it helps troubleshoot this issue, I have included some information about the job and the parameters used below.

BWA-MEM2 on data 1, data 8, and data 5 (mapped reads in BAM format)
Dbkey: bosTau9
Format: bam

Galaxy Tool ID: toolshed.g2.bx.psu.edu/repos/iuc/bwa_mem2/bwa_mem2/2.2.1+galaxy0

Tool Parameters
Will you select a reference genome from your history or use a built-in index?
History

Use the following dataset as the reference sequence
Data 5: fasta.gz, 1957 sequences, 801.4 MB, bosTau9

Single or Paired-end reads
Paired

Select first set of reads
Data 8: fastqsanger.gz, 25.2GB, bosTau9

Select second set of reads
Data 1: fastqsanger.gz, 25.7 GB, bosTau9

Enter mean, standard deviation, max, and min for insert lengths.
150

Set read groups information?
Do not set

Select analysis mode
Illumina

BAM sorting mode
Sort by chromosomal coordinates

Job Resource Parameters
No

Job Metrics
Cores Allocated: 10
Memory Allocated (MB): 29995
Job Runtime (Wall Clock): 6 minutes

meminfo
Total System Memory: 28.6 GB
Total System Swap: 0 bytes

Hi @edf393

The FAQs explain what you can do in general to keep working on a public Galaxy server. In short: use a built-in indexed genome as the mapping target (bosTau8 instead of the custom bosTau9 fasta), split the query reads into a collection, run some QA, map, then merge the results.

Or, you can consider setting up a private Galaxy server with scaled up resources.
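To make the "split the query reads" step from the short answer concrete, here is a minimal Python sketch of the idea outside Galaxy. It assumes plain 4-line FASTQ records (no wrapped sequence lines); the function name `split_fastq` and the tiny in-memory demo reads are hypothetical, and on Galaxy you would use a splitting tool rather than a script.

```python
import io
from itertools import islice

def split_fastq(handle, reads_per_chunk):
    """Yield lists of FASTQ lines, each holding up to reads_per_chunk records.

    Assumes every record is exactly 4 lines (header, sequence, '+', qualities).
    """
    lines_per_chunk = 4 * reads_per_chunk
    while True:
        chunk = list(islice(handle, lines_per_chunk))
        if not chunk:
            break
        yield chunk

# Hypothetical three-read input standing in for a real FASTQ file.
demo = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nTTGA\n+\nIIII\n"
    "@r3\nGGCC\n+\nIIII\n"
)
chunks = list(split_fastq(demo, reads_per_chunk=2))
print([len(c) // 4 for c in chunks])  # reads per chunk
```

For paired-end data, the same `reads_per_chunk` would need to be applied to both files so that mates stay in the same chunk number.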

Details:

  • The job failed because it exceeded the memory limits of the public Galaxy server (see FAQ).
  • Specifically:
    • The fastq datasets are too large, at ~25 GB compressed per end (see FAQ).
    • The custom fasta genome is too large and also has a formatting problem (see FAQ).
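
The memory failure can be checked against the numbers in the job log and metrics. The sketch below is only arithmetic on those reported values. It assumes the suffix array stores one 8-byte integer per position of "ref seq len" (an inference from the log, since that product matches the reported 40.40 GB, not something taken from the BWA-MEM2 documentation) and that "Memory Allocated (MB)" is counted in units of 1024 × 1024 bytes.

```python
GIB = 1024 ** 3

ref_seq_len = 5_422_419_662    # "ref seq len" from the tool's standard error
bytes_per_entry = 8            # assumption: one 64-bit integer per position
memory_allocated_mb = 29_995   # "Memory Allocated (MB)" from Job Metrics

suffix_array_gib = ref_seq_len * bytes_per_entry / GIB
allocated_gib = memory_allocated_mb / 1024  # assumption: MB here means MiB

print(f"suffix array request: {suffix_array_gib:.2f} GiB")
print(f"job allocation:       {allocated_gib:.2f} GiB")
print("fits in allocation:", suffix_array_gib <= allocated_gib)
```

Under these assumptions the suffix array alone asks for about 40.4 GiB, while the job was given roughly 29.3 GiB, so indexing a custom genome of this size cannot succeed within this allocation regardless of how the reads are split.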