I am trying to convert from BAM to FASTQ so that I can process the outputs through PEAR and merge collections, but it is throwing this error (Galaxy):
/usr/local/bin/picard: line 5: warning: setlocale: LC_ALL: cannot change locale (en_US.UTF-8): No such file or directory
Picked up _JAVA_OPTIONS: -Xmx12G -Xms1G
Aug 19, 2024 4:03:26 PM com.intel.gkl.NativeLibraryLoader load
INFO: Loading libgkl_compression.so from jar:file:/usr/local/share/picard-3.1.1-0/picard.jar!/com/intel/gkl/native/libgkl_compression.so
Exception in thread "main" htsjdk.samtools.SAMFormatException: SAM validation error: ERROR::INVALID_MAPPING_QUALITY:Record 6, Read name 18783975, MAPQ should be 0 for unmapped read.
at htsjdk.samtools.SAMUtils.processValidationErrors(SAMUtils.java:460)
at htsjdk.samtools.BAMFileReader$BAMFileIterator.advance(BAMFileReader.java:865)
at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:850)
at htsjdk.samtools.BAMFileReader$BAMFileIterator.next(BAMFileReader.java:818)
at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:591)
at htsjdk.samtools.SamReader$AssertingIterator.next(SamReader.java:570)
at picard.sam.SamToFastq.doWork(SamToFastq.java:204)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:280)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:105)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:115)
Welcome, @Luke
Do you still need help? I think people had trouble helping since you didn’t share an example of the BAM file itself.
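That said, this is the part of the error message that points to a formatting issue:
SAM validation error: ERROR::INVALID_MAPPING_QUALITY:Record 6, Read name 18783975, MAPQ should be 0 for unmapped read.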
So, compare to the BAM/SAM specification and see if you can notice a problem. Do the data lines even have a MAPQ value?
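To make that comparison concrete, here is a minimal Python sketch of the rule named in the error message. In SAM, bit 0x4 of the FLAG column (column 2) marks a read as unmapped, and MAPQ is column 5; the validation in the log above complains when an unmapped read has a non-zero MAPQ. The function name and the sample records are hypothetical, and the sketch works on SAM text lines (for example, a BAM converted to SAM), not on the binary BAM itself.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def find_invalid_mapq(sam_lines):
    """Return (record_number, read_name, mapq) for unmapped reads whose MAPQ is not 0."""
    problems = []
    record_number = 0
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # skip blank lines and header lines
        record_number += 1
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # too short to hold FLAG and MAPQ; cannot check this record
        flag = int(fields[1])
        mapq = int(fields[4])
        if flag & UNMAPPED_FLAG and mapq != 0:
            problems.append((record_number, fields[0], mapq))
    return problems


# Hypothetical records: both reads are unmapped (FLAG 77 and 141 include 0x4),
# but only the second one carries a non-zero MAPQ.
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read_a\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "read_b\t141\t*\t0\t255\t*\t*\t0\t0\tTTGA\tIIII",
]
print(find_invalid_mapq(example))
```

Any record this reports is one that a strict validation setting would likely reject with the same kind of message.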
Just to be clear for everyone reading … this person is trying to use the SamToFastq tool. Link to tool at a UseGalaxy server.
Your workflow looks technically fine to me, assuming the BAM passed into it actually contains paired-end unmapped reads! As a guess: are you sure it isn’t empty? Meaning: all headers and no data lines at all? Or, maybe a better fit for your message, a truncated data line (starting in the middle of the 6th data line)?
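Both guesses (no data lines, or a truncated data line) can be checked quickly once the BAM is viewed as SAM text. The sketch below is only an illustration: the function name and sample lines are made up. It relies on the SAM rule that header lines start with "@" and that every alignment line has at least 11 tab-separated mandatory fields.

```python
MANDATORY_FIELDS = 11  # QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL


def summarize_sam(sam_lines):
    """Count header and data lines, and list data lines with too few fields."""
    headers = 0
    records = 0
    short_records = []
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            headers += 1
            continue
        records += 1
        if len(line.split("\t")) < MANDATORY_FIELDS:
            short_records.append(records)  # 1-based data line number
    return {"headers": headers, "records": records, "short_records": short_records}


# Hypothetical input: one header, one complete record, one truncated record.
sample = [
    "@HD\tVN:1.6",
    "r1\t77\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r2\t141\t*\t0",
]
print(summarize_sam(sample))
```

A result with zero records would mean the file is all headers; any entry in short_records would point at a truncated data line.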
Please let us know if you still need help. A shared history that contains some sample data run through your workflow would be a great help for us to give the best advice for getting this resolved!
I’ve attached the history below
Hi @Luke
Thanks for sharing the history. Your BAM data looks fine, and has data lines. Good! Sometimes this error indicates a problem there, which is why I was asking about that first.
For your case, it turns out the tool just wants a stricter BAM format than the data you provided, due to a particular parameter setting. This is not uncommon and can be adjusted.
Your current setting is “STRICT”. Screenshot of the rerun icon for the failed dataset.
Go to the workflow, open the editor, click on the tool, and adjust this:
Select validation stringency
You can try both of the other options (LENIENT and SILENT) to see which one better fits your data.
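For anyone running Picard outside of Galaxy, the same setting corresponds to Picard's VALIDATION_STRINGENCY argument. The following is a rough sketch that only builds and prints the command line; it does not run anything. The file names are placeholders, the helper function is hypothetical, and it assumes a `picard` wrapper script is on the PATH (as in the log above) and that the newer dash-style argument syntax is in use.

```python
import shlex

ALLOWED_STRINGENCY = {"STRICT", "LENIENT", "SILENT"}


def sam_to_fastq_command(bam, fastq1, fastq2, stringency="LENIENT"):
    """Build (but do not run) a Picard SamToFastq command for paired-end output."""
    if stringency not in ALLOWED_STRINGENCY:
        raise ValueError(f"unknown validation stringency: {stringency}")
    return [
        "picard", "SamToFastq",
        "-I", bam,        # input BAM
        "-F", fastq1,     # first-end FASTQ
        "-F2", fastq2,    # second-end FASTQ
        "--VALIDATION_STRINGENCY", stringency,
    ]


# Placeholder file names, for illustration only.
cmd = sam_to_fastq_command("input.bam", "forward.fastq", "reverse.fastq")
print(shlex.join(cmd))
```

LENIENT still reports validation problems as warnings, while SILENT suppresses them, so LENIENT is usually the more informative one to try first.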
Hope this helps but let us know what happens and we can follow up more if needed.