Hi all,
I created a bash script that runs a tool, puts some of its output files into a directory I created for this tool, and then copies them to the paths held in bash positional parameters:
cp ${output_dir}/file1.txt $2
cp ${output_dir}/file2.txt $3
Accordingly, in the XML tool definition file that runs this bash script, I specify these outputs:
So I force Galaxy to put these outputs in a directory I created. Is this wrong, or should I let Galaxy put the output files in its default directory?
This is a bash wrapper that runs my Python tool (mytool.py), which takes one input and produces two outputs. The tool puts these outputs in a folder created for it, and I force Galaxy to fetch the outputs from that folder. In fact, after I run this tool in Galaxy, all the files are in the folder “/results/output-tools/MYTOOL/sessions/${session_dir}” and not in the default folder where Galaxy stores outputs, /database/files. Is this OK, or could it cause some sort of problem? I am running a Galaxy instance on a server with a lot of tools that I integrated this way.
#!/bin/sh
# run the tool and capture its stdout
run_output=$(python mytool.py --input "$1")
# extract the session name from a line like "Session: <name>"
session_dir=$(echo "${run_output}" | grep "Session:" | sed "s/^Session: //")
# folder with the session name
output_dir="/results/output-tools/MYTOOL/sessions/${session_dir}"
# check that the output files exist
for f in "file1.txt" "file2.txt"; do
    if [ ! -e "${output_dir}/$f" ]; then
        exit 1
    fi
done
cp "${output_dir}/file1.txt" "$2"
cp "${output_dir}/file2.txt" "$3"
But if I remove the folder, will the output files still show up in Galaxy?
The fact is that my tool puts all the outputs in a specific directory, and I need Galaxy to retrieve them from that specific folder. But as you said, I then have the data both in the Galaxy folder and in my own folder. Maybe I should just put all the files where the script runs, moving them out of the session folder I created, and let Galaxy deal with them.
Yes, but you remove the folder only after you have moved or copied the files to Galaxy, so the removal should be the very last line of the bash script. $2 and $3 are basically paths.
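A minimal sketch of that ordering (copy first, remove last). A mktemp directory stands in for your session folder, and dest1/dest2 stand in for the $2/$3 paths Galaxy passes, so the sketch runs on its own:

```shell
#!/bin/sh
# dest1/dest2: stand-ins for the $2/$3 output paths from Galaxy
dest1="dataset1.txt"
dest2="dataset2.txt"

# stand-in for the session folder your tool writes into
output_dir=$(mktemp -d)
printf 'one\n' > "${output_dir}/file1.txt"
printf 'two\n' > "${output_dir}/file2.txt"

# copy the outputs to the paths Galaxy expects
cp "${output_dir}/file1.txt" "$dest1"
cp "${output_dir}/file2.txt" "$dest2"

# remove the session folder only after the copies succeeded
rm -rf "${output_dir}"
```

Once the copies have run, Galaxy has its own copy of the data, so deleting the session folder does not affect what shows up in the history.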
It seems that the bash wrapper only calls Python (plus some post-processing). In my opinion you don’t need this bash wrapper script; you could code it directly in the tool’s command block.
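For illustration, a sketch of what that could look like (not your exact tool: the parameter names input, output1, output2 and the --output flag are assumptions, and it presumes the Python script is changed to write into a given directory):

```xml
<tool id="mytool" name="MyTool" version="0.1.0">
    <command><![CDATA[
        mkdir -p outdir &&
        python '$__tool_directory__/mytool.py' --input '$input' --output outdir &&
        cp outdir/file1.txt '$output1' &&
        cp outdir/file2.txt '$output2'
    ]]></command>
    <inputs>
        <param name="input" type="data" format="txt" label="Input file"/>
    </inputs>
    <outputs>
        <data name="output1" format="txt" label="file1"/>
        <data name="output2" format="txt" label="file2"/>
    </outputs>
</tool>
```

Galaxy substitutes '$input', '$output1' and '$output2' with the real dataset paths, which is exactly the role $1, $2 and $3 play in your wrapper.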
Like @gbbio, I would suggest that you modify the Python script (assuming it’s your code) to just output to a configurable directory, e.g. python mytool.py --input $input --output output. The reason is that a central directory like /results/output-tools/MYTOOL/sessions/ (from your example) does not exist on other systems and complicates installation.
But I completely disagree that tempfolder=$(mktemp -d /galaxy/galaxy/database/files/XXXXXX) is a good idea. First of all, /galaxy/galaxy/database/files/ might not exist on all systems, and more importantly database/files/ is the central location of Galaxy’s files; one is not supposed to write into it. Also, that directory is writable only by the Galaxy system user, i.e. writing won’t work if Galaxy runs jobs as the real user.
The solution is simple. Each Galaxy job already runs in its own temporary directory (called the job working directory). So just create a directory (e.g. mkdir outdir) in the current working directory and use that. Galaxy will take care of cleanup after the job finishes.
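A sketch of the wrapper rewritten this way, assuming the Python script is changed to accept an --output directory. The printf lines simulate mytool.py so the sketch is self-contained, and out1/out2 stand in for the $2/$3 paths Galaxy passes:

```shell
#!/bin/sh
# out1/out2: stand-ins for the $2/$3 output paths from Galaxy
out1="galaxy_output1.txt"
out2="galaxy_output2.txt"

# create an output directory inside the job working directory
mkdir -p outdir

# a real wrapper would run: python mytool.py --input "$1" --output outdir
# the printf lines below simulate that tool's output
printf 'result one\n' > outdir/file1.txt
printf 'result two\n' > outdir/file2.txt

# hand the results to Galaxy
cp outdir/file1.txt "$out1"
cp outdir/file2.txt "$out2"

# no rm needed: Galaxy removes the job working directory after the job ends
```

Because everything stays relative to the current working directory, no fixed path like /results/output-tools/MYTOOL/sessions/ is needed, and the tool works unchanged on any Galaxy instance.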