Error with TrimGalore - Fatal error: Exit code 141 ()

I have a paired collection of paired-end fastqsanger.gz files that I am trying to run through Trim Galore! with the default settings. However, after running for several hours, one of the jobs fails with the following error: Fatal error: Exit code 141 ()

Multicore support not enabled. Proceeding with single-core trimming.
Path to Cutadapt set as: 'cutadapt' (default)
Cutadapt seems to be working fine (tested command 'cutadapt --version')
Cutadapt version: 2.3
single-core operation.
Output will be written into the directory: /corral4/main/jobs/040/441/40441407/working/

Attempting to auto-detect adapter type from the first 1 million sequences of the first file (>> input_1.fastq.gz <<)

Found perfect matches for the following adapter sequences:
Adapter type	Count	Sequence	Sequences analysed	Percentage
Illumina	8075	AGATCGGAAGAGC	1000000	0.81
Nextera	1493	CTGTCTCTTATA	1000000	0.15
smallRNA	11	TGGAATTCTCGG	1000000	0.00
Using Illumina adapter for trimming (count: 8075). Second best hit was Nextera (count: 1493)

Writing report to '/corral4/main/jobs/040/441/40441407/working/input_1.fastq.gz_trimming_report.txt'

Input filename: input_1.fastq.gz
Trimming mode: paired-end
Trim Galore version: 0.6.3
Cutadapt version: 2.3
Number of cores used for trimming: 1
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC' (Illumina TruSeq, Sanger iPCR; auto-detected)
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp
Output file(s) will be GZIP compressed

Cutadapt seems to be fairly up-to-date (version 2.3). Setting -j 1
Writing final adapter and quality trimmed output to input_1_trimmed.fq.gz

  >>> Now performing quality (cutoff '-q 20') and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file input_1.fastq.gz <<< 
10000000 sequences processed
20000000 sequences processed
30000000 sequences processed
40000000 sequences processed
50000000 sequences processed
60000000 sequences processed
gzip: short write

It’s not obvious to me where the error is coming from. Any insight is appreciated.

Hi @Danielle_Ireland

This error probably means one of the following:

  1. The job ran out of resources (memory) during execution.
  2. The input fastq dataset is truncated: either it was not fully uploaded to Galaxy, or the file already had problems before it was uploaded.
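
As background on the number itself: on Linux, an exit code above 128 conventionally means the process was ended by a signal, with the signal number being the code minus 128. The sketch below decodes a code that way; `describe_exit_code` is a hypothetical helper name, and it assumes a POSIX system. It only names the signal and does not tell you which of the two causes above applies.

```python
import signal


def describe_exit_code(code):
    """Return the signal name for a shell-style exit code, or None.

    Assumes the common shell convention that exit codes above 128
    encode "terminated by signal (code - 128)".
    """
    if code > 128:
        try:
            return signal.Signals(code - 128).name
        except ValueError:
            # No signal with that number on this platform.
            return None
    return None


print(describe_exit_code(141))
```

On Linux this prints SIGPIPE (signal 13), meaning a process tried to write to a pipe whose other end had already closed.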

For item 1: Try a rerun. The cluster node this job ran on may have been running another, larger job at the same time. To have the new result replace the old error dataset: navigate to the errored dataset and click the “rerun” (double-circle) icon. The tool form will have a new option, right above the submit button, to replace the dataset with the rerun. Choose that option, then submit.

For item 2: If that rerun fails, examine the original fastq dataset more closely. Does its size match the size of the file at its source? Can you run other tools against it successfully (example: FastQC)? You could also try to uncompress it in Galaxy (pencil icon > convert > uncompress). If that fails, it is another clue that something is wrong with the file.
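
If you still have the original file on your own computer, the same check can be done there before re-uploading. This is a minimal sketch, assuming Python 3.8 or later; `gzip_is_complete` is a hypothetical helper name, and `input_1.fastq.gz` in the usage note is only a placeholder for your local copy. It reads the whole file and reports whether it decompresses cleanly to the end.

```python
import gzip
import zlib


def gzip_is_complete(path, chunk_size=1024 * 1024):
    """Return True if the gzip file at `path` decompresses cleanly to the end."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read through the whole file in chunks; a truncated file
            # raises an exception before the end is reached.
            while handle.read(chunk_size):
                pass
    except EOFError:
        # The compressed stream ended before its end-of-stream marker.
        return False
    except (gzip.BadGzipFile, zlib.error):
        # Not a gzip file, failed checksum, or corrupted compressed data.
        return False
    return True
```

For example, `gzip_is_complete("input_1.fastq.gz")` returning False would point to item 2. A file cut off exactly between two concatenated gzip members would still pass, so compare the file size with the source as well.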

In summary:

  • If the file is truncated, you’ll need to find the full dataset and upload it.
  • If the file is not truncated and the job still fails for exceeding resources, review the FastQC report. You might need to customize the Trim Galore! settings or use a different tool.

Note: Sometimes the problem is an incompatible compression format, and the data needs to be uncompressed before it is uploaded to Galaxy, but that doesn’t seem to be your issue.

Alternative tools include these:

  • Cutadapt - Remove adapter sequences from FASTQ/FASTA
  • fastp - fast all-in-one preprocessing for FASTQ files
  • Trimmomatic - flexible read trimming tool for Illumina NGS data