Hi all,
I’m encountering a frustrating and seemingly random issue when using the Concatenate Datasets tool in Galaxy to merge `fastqsanger.gz` files from two different lanes.
Context:
- I have a series of lane-specific `fastqsanger.gz` files, and I want to concatenate each sample’s matching files from the two lanes.
- To do this, I’ve placed the two lanes into two dataset collections and launched parallel jobs, one for each pair.
- The tool runs 22 parallel jobs, each combining two `.gz` fastq files.
Problem:
- Some of the jobs fail silently. They appear to succeed (green status, file size = sum of the inputs), but:
  - Galaxy cannot recognize the outputs as valid `fastqsanger.gz` files.
  - Downloading and inspecting them shows that they are corrupted binary files: they contain lots of null bytes (`0x00`) and do not start with `1F 8B`, the magic bytes expected at the start of any valid gzip file (see the quick check sketched after this list).
- Most importantly, the failures are random:
  - Running the exact same collections multiple times yields a different set of failed outputs each time.
  - This suggests the problem is not related to the input files or the tool parameters.
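For reference, here is roughly the check I run on the downloaded outputs after each attempt. It is a minimal sketch: the `outputs/` directory and the glob pattern are placeholders for wherever the downloads land.

```python
# Minimal sketch of the corruption check, assuming the downloaded outputs
# sit in ./outputs/ -- both the directory and the glob are placeholders.
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # every valid gzip file starts with these two bytes

for path in sorted(Path("outputs").glob("*.fastqsanger.gz")):
    with open(path, "rb") as f:
        head = f.read(2)
        sample = f.read(1024 * 1024)  # inspect the first MiB only
    magic_ok = head == GZIP_MAGIC
    nulls = sample.count(b"\x00")
    print(f"{path.name}: gzip magic {'OK' if magic_ok else 'MISSING'}, "
          f"{nulls} null bytes in first MiB")
```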
Observation:
I carefully checked the job command lines and noticed a pattern:
The working jobs use paths like:

```
python /cvmfs/main.galaxyproject.org/galaxy/tools/filters/catWrapper.py \
    '/jetstream2/scratch/main/jobs/.../outputs/...dat' \
    '/jetstream2/scratch/main/jobs/.../inputs/...dat' ...
```

The failing jobs use paths like:

```
python /cvmfs/main.galaxyproject.org/galaxy/tools/filters/catWrapper.py \
    '/corral4/main/jobs/.../outputs/...dat' \
    '/corral4/main/objects/.../dataset_....dat' ...
```
I learned that `jetstream2` is a compute environment (Texas Advanced Computing Center) and `corral4` is a storage backend. Jobs seem to be dispatched between these environments at random, which could explain why the failures are random.
I also tried all 4 available “Concatenate Datasets” tools, and none resolved the issue.
My Questions:
- Why does concatenation fail when inputs come from `corral4` but succeed when they come from `jetstream2`?
- Is this a known bug related to dataset mounting/caching between the compute nodes and the object store?
- Is there a workaround (e.g., forcing a copy of the files to scratch before concatenation)? As a last resort I could merge the downloads locally, as sketched after this list.
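For completeness, this is the kind of local fallback I have in mind, as a minimal sketch assuming the per-lane files have already been downloaded (the file names are placeholders). Plain byte-wise concatenation is safe for gzip, since the result is a valid multi-member gzip stream:

```python
# Hypothetical local fallback: byte-wise concatenation of two lane files.
# Concatenated gzip files form a valid multi-member gzip stream.
# File names are placeholders for the downloaded per-lane datasets.
import gzip
import shutil

lanes = ["sample1_L001.fastqsanger.gz", "sample1_L002.fastqsanger.gz"]
merged = "sample1_merged.fastqsanger.gz"

with open(merged, "wb") as out:
    for lane in lanes:
        with open(lane, "rb") as src:
            shutil.copyfileobj(src, out)

# Sanity check: the merged file must decompress end to end without errors.
with gzip.open(merged, "rb") as f:
    while f.read(1024 * 1024):
        pass
print(f"{merged} decompresses cleanly")
```

Doing this for all 22 pairs outside Galaxy defeats the purpose of the collections, though, so I’d much rather understand and fix the underlying issue.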
This issue is blocking my pipeline, and I’d really appreciate any help or suggestions on how to proceed.
Thanks in advance!