I have an RNA-seq workflow that was created from a history. The original history used a collection as input, and I can use the workflow to process collections without issue. However, I have one file that I want to put through the workflow as a single input file rather than a collection. I copied the original history, replaced the collection input with a single dataset input, and then deleted and re-established all of the workflow connections from left to right.

When I run this workflow with my single file input (named GS1683etcetc.fastq), the workflow seems to run. Strangely, though, all of the output files (e.g. FastQC, featureCounts) list a different filename as the input (that filename, 601K1BBetc.fastq, is the first file in the input collection from the history that was used to generate the workflow). I don't think the incorrect file was actually used, as the number of reads reported in FastQC matches the known number of reads from the correct input file. Any idea what I've done wrong here? Why do all of my output files from analyzing that sample return the wrong sample name?
@eyoungman The original workflow was likely using dataset collection metadata to rename outputs as a post-job action, which would explain why your outputs are labeled with 601K1BBetc.fastq even though the correct reads were processed. Preserving input file names is much easier to do with collections than with individual datasets.
Try the original workflow and just include your single dataset as the only element of the input "dataset collection".
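If you would rather script that than click through the UI, here is a minimal BioBlend sketch of building a one-element list collection; the server URL, API key, history ID, and dataset ID are placeholders you would substitute with your own values:

```python
from bioblend.galaxy import GalaxyInstance
from bioblend.galaxy.dataset_collections import (
    CollectionDescription,
    HistoryDatasetElement,
)

# Placeholders -- substitute your own server URL, API key, and IDs.
gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")
history_id = "YOUR_HISTORY_ID"
dataset_id = "ID_OF_GS1683etcetc_FASTQ"

# Build a list collection containing just the one FASTQ dataset, then
# run the original (unmodified) workflow on this collection as usual.
collection = gi.histories.create_dataset_collection(
    history_id=history_id,
    collection_description=CollectionDescription(
        name="GS1683 single-sample list",
        type="list",
        elements=[HistoryDatasetElement(name="GS1683etcetc", id=dataset_id)],
    ),
)
print(collection["id"])  # use this collection as the workflow input
```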
If you really want to run a workflow on a single dataset now, not a collection, one alternative is to add a #some-tag (a "name" tag) to your single starting input dataset. Downstream tool and/or workflow outputs will all have that same original #some-tag added to them, so the sample name is carried through.
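In the UI you would just click the dataset's tag icon and type the #tag. For completeness, a minimal BioBlend sketch of adding the same name tag through the API (URL, key, and IDs are placeholders):

```python
from bioblend.galaxy import GalaxyInstance

# Placeholders -- substitute your own server URL, API key, and IDs.
gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")
history_id = "YOUR_HISTORY_ID"
dataset_id = "ID_OF_GS1683etcetc_FASTQ"

# A "#GS1683" name tag shown in the UI is stored as "name:GS1683" via the API.
# Name tags propagate to every downstream output derived from this dataset.
gi.histories.update_dataset(history_id, dataset_id, tags=["name:GS1683"])
```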
Thanks!