I am trying to create a workflow where the output of one tool feeds into the next. I have SamToFastq selected for Step 1, and I was hoping to feed its fastq output into Trim Galore!, but I cannot, because the workflow canvas shows the SamToFastq output as .txt instead of fastq (see attached photo). What did I do wrong? Thanks.
@mlim: What version of the SamToFastq tool are you using in your workflow?
I’ll also rerun a test with the most current version: SamToFastq extract reads and qualities from SAM/BAM dataset and convert to fastq (Galaxy Version 2.18.2.1). Perhaps there was a regression in functionality. The tests will go back into the original test history: https://usegalaxy.org/u/jen/h/test-datatype-sam-to-fastq. If these fail (that is, if I can reproduce the txt output), I’ll post back with a new issue ticket. This tool is included in a GTN tutorial, so it is important that it works correctly.
The output naming was corrected and tested successfully before. However, that tool version and the newer versions have a separate problem. So keep using the version you are using now and apply the post-job action as @mvdbeek suggests. That is the best way to use the tool until the newer versions are fixed (this is a different problem from the completed txt-not-fastqsanger fix).
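For anyone editing an exported workflow file by hand, here is a rough sketch of what a "change datatype" post-job action entry looks like in the `post_job_actions` section of a workflow step, as far as I understand the format. The helper name `change_datatype_action` and the output name `output_fastq` are hypothetical, made up for illustration; check the real output name of your SamToFastq step in the workflow editor or the exported file.

```python
import json


def change_datatype_action(output_name, newtype):
    """Build a post-job action entry that re-assigns an output's datatype.

    Assumption: the entry is keyed by the action type followed by the
    output name, which is how exported workflow files appear to store it.
    """
    key = "ChangeDatatypeAction" + output_name
    return {
        key: {
            "action_type": "ChangeDatatypeAction",
            "output_name": output_name,
            "action_arguments": {"newtype": newtype},
        }
    }


# "output_fastq" is a placeholder, not the tool's confirmed output name.
post_job_actions = change_datatype_action("output_fastq", "fastqsanger")
print(json.dumps(post_job_actions, indent=2))
```

In practice it is easier to set this from the workflow editor's Configure Output section for the step, which writes the same kind of entry for you.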
The alternative tool above may be more useful anyway. It has expanded input/output options, including but not limited to: 1) input bam, sam or cram data, coordinate sorted or not, 2) output fasta or fastqsanger (or the appropriate fastq sub-type), 3) output a compressed or uncompressed version of the data, 4) output paired-end reads in different ways (R1 only, R2 only, both R1+R2 in two distinct datasets, or R1+R2 interleaved in a single dataset), and 5) output with the appropriate datatype, assigned directly by the tool. This means there is no need to use Configure Output to re-assign the datatype when the tool is used in a workflow, and no need to re-assign or re-detect the datatype when it is used directly in a History.
Allowing Galaxy to assign the datatype ensures that it is correct and matches the actual content of the user-specified output type. This avoids unintentionally introducing mismatched “datatype” metadata, which can lead to downstream tool errors.
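If you do re-assign a datatype manually, it can help to confirm first that the content really is fastq. The following is a minimal sketch of such a check, run outside Galaxy on a downloaded copy of the dataset. It is not how Galaxy detects datatypes; the function name `looks_like_fastq` is made up for illustration, and it only tests the basic four-line record layout, not the quality encoding that distinguishes fastqsanger from other fastq sub-types.

```python
from itertools import islice


def looks_like_fastq(lines, max_records=100):
    """Return True if the leading records follow the 4-line fastq layout.

    Each record is expected to be: '@' header, sequence, '+' separator,
    and a quality string with the same length as the sequence.
    """
    # Ignore blank lines so a trailing empty line does not cause a failure.
    non_blank = (ln.rstrip("\r\n") for ln in lines if ln.strip())
    seen = 0
    while seen < max_records:
        record = list(islice(non_blank, 4))
        if not record:
            break  # no more data
        if len(record) < 4:
            return False  # truncated record
        header, seq, sep, qual = record
        if not header.startswith("@") or not sep.startswith("+"):
            return False
        if len(seq) != len(qual):
            return False
        seen += 1
    return seen > 0  # an empty input is not fastq


sample = ["@read1\n", "ACGT\n", "+\n", "IIII\n"]
print(looks_like_fastq(sample))
```

To check a file, pass the open file handle, for example `with open("reads.fastq") as fh: looks_like_fastq(fh)`, where `reads.fastq` is a placeholder path.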