Hi,
I want to create my own Galaxy tool based on a Python command-line tool (READemption, an RNA-Seq analysis pipeline; see the READemption 2.0.0 documentation).
The tool is published on Conda and PyPI and also has a Docker image.
What is the best practice for wrapping this tool as a Galaxy tool?
Specifically, I wonder how the flow of input files, intermediate results and output files should be managed.
The tool performs RNA-seq analysis from the command line and has several subcommands:
create: creates an input folder structure where users can put their input files (reference sequences, annotation files and read files).
Calling this command creates the following input and output folders:
READemption_analysis
├── config.json
├── input
│   ├── reads
│   ├── salmonella_annotations
│   └── salmonella_reference_sequences
└── output
    └── align
        ├── alignments
        ├── index
        ├── processed_reads
        ├── reports_and_stats
        │   ├── stats_data_json
        │   └── version_log.txt
        └── unaligned_reads
Now the user has to put their input files into the corresponding folders:
reference sequences go into READemption_analysis/input/salmonella_reference_sequences
annotation files go into READemption_analysis/input/salmonella_annotations
reads go into READemption_analysis/input/reads
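To make the staging step concrete, here is a minimal sketch of what I imagine a wrapper script would have to do before calling the tool: link the files it receives into the folders READemption expects. The function name stage_inputs and the example file genome.fa are hypothetical, and I am assuming that symlinks are acceptable to the tool instead of copies.

```python
import os
import tempfile


def stage_inputs(project_dir, references, annotations, reads, species="salmonella"):
    """Symlink input files into the READemption input folder layout.

    Hypothetical helper: folder names follow the layout shown above,
    everything else is an assumption for illustration.
    """
    input_dir = os.path.join(project_dir, "input")
    targets = {
        os.path.join(input_dir, f"{species}_reference_sequences"): references,
        os.path.join(input_dir, f"{species}_annotations"): annotations,
        os.path.join(input_dir, "reads"): reads,
    }
    staged = []
    for folder, files in targets.items():
        # Create the folder even if no files are given for it.
        os.makedirs(folder, exist_ok=True)
        for path in files:
            link = os.path.join(folder, os.path.basename(path))
            if not os.path.lexists(link):
                os.symlink(os.path.abspath(path), link)
            staged.append(link)
    return staged


# Usage example with a throwaway file in a temporary directory.
stage_workdir = tempfile.mkdtemp()
genome = os.path.join(stage_workdir, "genome.fa")
open(genome, "w").close()
stage_project = os.path.join(stage_workdir, "READemption_analysis")
staged = stage_inputs(stage_project, [genome], [], [])
print(staged)
```

In Galaxy the source paths would be whatever dataset paths the tool receives, so the original file names would probably have to be passed in separately.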
After providing the files, the user can run the first subcommand, ‘align’, which aligns the reads to the reference sequences and creates output files such as statistics and BAM files. These output files are written to the corresponding output folders (READemption_analysis/output/align/alignments or READemption_analysis/output/align/reports_and_stats).
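Going the other way, the results would have to be picked up from those output folders after the run. A rough sketch of how I picture that for the BAM files, assuming they end in .bam and sit directly in the alignments folder (the function name collect_alignments and the library file names are made up for the example):

```python
import glob
import os
import tempfile


def collect_alignments(project_dir):
    """Return the sorted BAM files found in the align output folder."""
    alignments_dir = os.path.join(project_dir, "output", "align", "alignments")
    return sorted(glob.glob(os.path.join(alignments_dir, "*.bam")))


# Usage example: fake an output folder with hypothetical library names.
collect_project = os.path.join(tempfile.mkdtemp(), "READemption_analysis")
fake_alignments = os.path.join(collect_project, "output", "align", "alignments")
os.makedirs(fake_alignments)
for name in ("lib_b.bam", "lib_a.bam", "lib_a.bam.bai"):
    open(os.path.join(fake_alignments, name), "w").close()

bams = collect_alignments(collect_project)
print([os.path.basename(b) for b in bams])  # index files (.bai) are not matched
```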
The other subcommands perform gene quantification, create coverage files or run differential gene expression analysis (using DESeq2). When called, each of these subcommands creates its own output folders and writes its result files into them.
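Because every subcommand works on the same project folder, a wrapper would presumably have to run them one after another in a single working directory. The sketch below only builds the command lines and does not run anything. The subcommand names and the --project_path option are how I understand the READemption command line; please treat them as assumptions and check them against `reademption <subcommand> --help`. The additional options each subcommand needs are left out, and build_pipeline is a hypothetical helper.

```python
import shlex


def build_pipeline(project_dir, subcommands=("align", "gene_quanti", "coverage", "deseq")):
    """Build one command line per subcommand, all sharing one project folder.

    Assumption: every subcommand accepts --project_path. Subcommand-specific
    options (libraries, conditions, features, ...) are deliberately omitted.
    """
    return [["reademption", sub, "--project_path", project_dir] for sub in subcommands]


commands = build_pipeline("READemption_analysis")
for cmd in commands:
    print(shlex.join(cmd))
```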
So my question is: how do I implement such a tool, and how do I manage the flow of input and output files? The tool currently handles this by reading from the corresponding input folders and, for later subcommands, from the output folders of earlier ones, such as the folder where the BAM files are saved. Can I simply write all subcommands (we use argparse) as commands in Galaxy’s tool.xml file, or do I have to reimplement our controller.py as a tool.xml or as a Galaxy workflow? Is it possible to keep the input/output folder structure when implementing the tool as a Galaxy tool?
Any help is appreciated, and if you need further information, please let me know.
Best wishes,
Till