@gbbio may have more ideas (is a pro at troubleshooting!), but I would suggest posting back the first two fastq reads from both the forward and the reverse inputs to help with that. Quote the content to preserve the format. Also note the currently assigned datatype (copy/paste to ensure it is exact).
The tool is picky about the input sequence and quality score lines. We can help with any other fix-ups or the appropriate parameters that might work with the data “as-is”.
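If you have a local copy of the files, here is a minimal sketch for pulling out the first two reads to post back. It assumes plain four-line fastq records (no wrapped sequence lines) and treats any path ending in `.gz` as gzip-compressed. The function name `first_reads` and the file names are just placeholders for illustration.

```python
import gzip
from itertools import islice


def first_reads(path, n_reads=2):
    """Return the first n_reads fastq records from path as one string.

    Assumes each record is exactly four lines (header, sequence, '+', quality).
    """
    # Pick a text-mode opener based on the file extension (an assumption).
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return "".join(islice(handle, 4 * n_reads))


if __name__ == "__main__":
    # Hypothetical file names; replace with your own forward/reverse files.
    for name in ("forward.fastq", "reverse.fastq"):
        try:
            print(name)
            print(first_reads(name))
        except FileNotFoundError:
            print(f"{name} not found, skipping")
```

Paste the printed output inside a quoted/preformatted block so the line breaks survive.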
Illumina reads come in a few different “flavours”, depending on the version of the Illumina pipeline that produced them. The end goal is to get all of the data into a standardized fastqsanger datatype variant, and to make sure the assigned datatype/compression matches the actual dataset content. Most tools require fastq data to be in a fastqsanger variant format, mostly because of how the quality scores will be interpreted. If the quality score scaling is not what is expected, tools can error in odd ways or not output data at all, but that is usually easily resolved.
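As a rough illustration of why the quality score scaling matters, here is a heuristic sketch that guesses the ASCII offset from the quality lines. It is not how Galaxy or any particular tool decides. It assumes four-line records and relies on the usual character ranges: anything below ASCII 59 only occurs with the Phred+33 (fastqsanger-style) offset, while characters above ASCII 74 with nothing below 59 suggest an older +64 offset. The function name `guess_quality_offset` is made up for this example.

```python
def guess_quality_offset(fastq_lines):
    """Guess the quality offset (33 or 64) from an iterable of fastq lines.

    Returns None when there is no quality data or the range is ambiguous.
    This is a heuristic only; a small sample of reads can be misleading.
    """
    codes = []
    for index, line in enumerate(fastq_lines):
        # The quality string is the 4th line of each four-line record.
        if index % 4 == 3:
            codes.extend(ord(ch) for ch in line.rstrip("\r\n"))
    if not codes:
        return None
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters this low are only valid with the +33 offset.
        return 33
    if highest > 74:
        # High characters and no low ones point to a +64 offset
        # (older Illumina/Solexa-style scores).
        return 64
    return None


sanger_like = ["@r1\n", "ACGTACGT\n", "+\n", "!''*((((\n"]
older_illumina = ["@r1\n", "ACGTACGT\n", "+\n", "hhhhggff\n"]
print(guess_quality_offset(sanger_like))
print(guess_quality_offset(older_illumina))
```

If the guess is 64, the data most likely needs to be converted to fastqsanger scaling before most tools will accept it.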
One important part about content is that both inputs must contain the same base “read” names. If there are unpaired reads in either, that will cause problems. QA steps can remove one end of a pair. Some tools (example: Trimmomatic) will split those out into four outputs: 2 for reads that remain paired (forward + reverse), 1 for unpaired forward, and 1 for unpaired reverse. Tools in the Seqtk group can also be used to filter out any unpaired reads. Some of those tools perform a single manipulation, and one (tool: Seqtk seq) can apply multiple manipulations all at the same time.
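To check for unpaired reads before joining, something like the following sketch can help. It assumes four-line records and that the base read name is the first whitespace-delimited token of the header with any trailing `/1` or `/2` removed. That covers the common Illumina header styles but is still an assumption about your data. The helper names are hypothetical.

```python
def base_name(header):
    """Reduce a fastq header line to its base read name."""
    fields = header.strip()[1:].split()  # drop '@', ignore any comment part
    name = fields[0] if fields else ""
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]  # strip the old-style pair suffix
    return name


def read_names(fastq_lines):
    """Collect base read names from the header (1st) line of each record."""
    return [
        base_name(line)
        for index, line in enumerate(fastq_lines)
        if index % 4 == 0 and line.strip()
    ]


def compare_pairs(forward_lines, reverse_lines):
    """Report reads present in only one input, and whether the order matches."""
    fwd = read_names(forward_lines)
    rev = read_names(reverse_lines)
    return {
        "only_forward": sorted(set(fwd) - set(rev)),
        "only_reverse": sorted(set(rev) - set(fwd)),
        "same_order": fwd == rev,
    }


forward = ["@r1/1\n", "ACGT\n", "+\n", "IIII\n",
           "@r2/1\n", "TTGA\n", "+\n", "IIII\n"]
reverse = ["@r1/2\n", "TGCA\n", "+\n", "IIII\n"]
print(compare_pairs(forward, reverse))
```

For real files, pass open file handles instead of lists, for example `open(path)` or `gzip.open(path, "rt")` for compressed data. Any name listed under `only_forward` or `only_reverse` is an unpaired read that should be filtered out first.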
Also, be sure that you are using the most current version of the tool. At https://usegalaxy.org, this will be: FASTQ joiner on paired end reads (Galaxy Version 18.104.22.168+galaxy0)
The first few FAQs here may also help. Review if you want to – all of it is good reference info that helps to resolve the most common input problems. https://galaxyproject.org/support/#getting-inputs-right