Workflow automation?

wormball · April 8, 2021, 4:16pm

Hello!

I have to fill parameters into my workflow manually, mostly file names and sample names. I have to select from about 7 files about 20 times and enter similar but slightly different sample names (e.g. wn0552.A7CE75785, wn0552.A7CE75785_tumor, wn0552.A7CE75785_normal) another 20 times. As you may guess, it is time-consuming, error-prone and in general not exactly the kind of work I dreamed of.

But I suspect I can automate this process. On the Linux command line, all I would need is echo, cat, ln, etc., and a few rarely used ASCII symbols. In Galaxy, however, I found only cat.

So essentially I want:

A tool that receives one file and passes it to the output unchanged, so I could connect its output to every step that uses that file and set the file name only once. In Linux this could be cat or cp, but I would prefer ln or even a variable definition to save disk space and time.
A tool that receives a string and outputs it unchanged, like echo.
A tool that receives two strings and concatenates them, like echo $a$b. Then I could set my sample name to e.g. wn0552.A7CE75785, derive the other important strings from it, and use these strings in all my steps via “add connection to module”.
And also a tool that reports the number of processors on the current machine, to set the number of threads in some tools.
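To make the string part of the wish list concrete, here is a minimal sketch in plain Python of deriving the related names from one base sample name. It is only an illustration outside Galaxy, not a Galaxy tool; the function name derive_sample_names is hypothetical, and the suffixes are taken from the sample names above.

```python
def derive_sample_names(base):
    """Derive related sample names from one base name by concatenation."""
    return {
        "base": base,                 # passed through unchanged, like echo
        "tumor": base + "_tumor",     # like echo $a$b with b="_tumor"
        "normal": base + "_normal",   # like echo $a$b with b="_normal"
    }


names = derive_sample_names("wn0552.A7CE75785")
print(names["tumor"])
print(names["normal"])
```

The point is that the base name is typed once and every other string follows from it.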

Is this possible in Galaxy? Or maybe there is an easier way to automate workflows? Or do I have to write wrappers for all these tools myself?

Thanks in advance.

mvdbeek · April 9, 2021, 7:09am

Please check out “Using Workflow Parameters”; I think it covers all of this except the last part. Tools that can use multiple threads or cores can consume $GALAXY_SLOTS, which is set according to the resources the admin has configured.
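As a rough sketch of what consuming $GALAXY_SLOTS can look like from inside a script that a tool runs, the Python below reads the variable from the environment and falls back to a single thread when it is missing or invalid. The function name thread_count and the fallback of 1 are assumptions made for this example, not something Galaxy prescribes.

```python
import os


def thread_count(default=1):
    """Return the number of threads to use, read from GALAXY_SLOTS."""
    value = os.environ.get("GALAXY_SLOTS", "")
    try:
        slots = int(value)
    except ValueError:
        # Variable unset or not a number: use the fallback.
        return default
    # Guard against zero or negative values.
    return slots if slots > 0 else default


print(thread_count())
```

The returned value can then be passed to whatever thread option the underlying program offers.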

wormball · April 9, 2021, 9:07am

Thanks Marius! It is exactly what I needed.

wormball · April 9, 2021, 9:39am

Add Compose text parameter value tool to the workflow
Add two more “Components” using the “Insert components” button
Add the Regex Find And Replace

Unfortunately, I cannot find any of the entities mentioned.
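For orientation, the text handling that the quoted steps describe can be approximated in plain Python: joining components into one string, then applying a regular-expression replacement. This is only an assumed approximation of the behaviour, not the tools' actual implementation; the helper names compose_text and regex_replace are hypothetical.

```python
import re


def compose_text(components):
    """Join text components into a single string, in order."""
    return "".join(str(c) for c in components)


def regex_replace(text, pattern, replacement):
    """Replace every match of a regular expression in the text."""
    return re.sub(pattern, replacement, text)


# Build one sample name from components, then derive another by regex.
tumor = compose_text(["wn0552.A7CE75785", "_tumor"])
normal = regex_replace(tumor, r"_tumor$", "_normal")
print(tumor)
print(normal)
```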

wormball · April 9, 2021, 10:37am

I found the “compose_text_param” tool in the toolshed. I think the article should mention that this tool is not shipped out of the box and has to be installed.

And one more question. I created an “input dataset” for the reference genome, but MergeBamAlignment (unlike other tools) does not accept it as input! It accepts the genome file from the history without problems, but then I have to select it three times per run. Is there a way to make MergeBamAlignment accept the output of another step as the reference genome? Or is there another way to define the reference genome (including for the other tools, if possible)?

jennaj · April 9, 2021, 4:33pm

Hi @wormball – this tool has an option to accept a dataset in fasta format. This can be from an upstream tool or an input dataset.

You might need to disconnect all tools, then reconnect them starting from the input through the downstream tools, in the order of processing. I’m guessing that the tool had the setting originally to use a natively indexed genome, then it was changed to use a fasta from the history. Reconnecting the “noodles” resets the workflow’s metadata. If the genome is already an input dataset, you can also connect that input to multiple tools that need to incorporate it.

wormball · April 12, 2021, 3:31pm

I cleared the data type of my reference genome input field (it was “fasta”), and that allowed me to connect it to MergeBamAlignment (even though MergeBamAlignment wants fasta, as far as I can see). Thanks!
