How should I store files using original file names and extensions when I upload them

Bioinfo1980 · February 6, 2019, 1:49pm

Hello,
I am developing a new Galaxy tool using R as the main programming language.

When I upload a number of txt files with the Upload tool in Get Data, I noticed that it stores the files in the galaxy/database/files/000 directory and changes the file names from originalname.txt to dataset_number.dat. How can I set up the Upload tool to store the files under their originalname.txt names?

Thank you very much in advance for your help.

mvdbeek · February 6, 2019, 2:55pm

This isn’t possible. Galaxy may store files in object stores that don’t resemble classical file systems.
The correct way is to reference files in your tool using $file.element_identifier.

Bioinfo1980 · February 6, 2019, 3:11pm

Thank you for your response.

My problem is that the R package I am using to develop the tool requires a set of filenames and the location of the files, and it generates an ExpressionSet based on those files. If the filenames in the file directory are different from the original ones, I don’t know how to tell the function which files to use.

Could you suggest a solution?

Thank you

mvdbeek · February 6, 2019, 3:31pm

You can symlink the file to a known location in your command section. This is a very common pattern in Galaxy tools. So ln -s $file input1.tab or something like that, and then use input1.tab in your tool. If the name is important, you can use ln -s '$file' '${re.sub('[^\w\-_]', '_', $file.element_identifier)}.tab'
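To see what the substitution in that last command produces, the same call can be tried in plain Python, since Cheetah expressions are evaluated as Python. This is a minimal sketch: the helper name `safe_link_name` and the sample identifier are made up for illustration and are not part of Galaxy.

```python
import re


def safe_link_name(element_identifier, extension="tab"):
    """Replace every character that is not a word character, '-' or '_'
    with '_', then append the extension (mirrors the expression above)."""
    cleaned = re.sub(r"[^\w\-_]", "_", element_identifier)
    return f"{cleaned}.{extension}"


# Hypothetical identifier containing spaces, parentheses and a dot.
print(safe_link_name("my sample (1).txt"))
```

Note that the dot in the original name is replaced as well, so the original extension does not survive as an extension. In a tool's command block this kind of expression is usually preceded by a #import re line.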

Bioinfo1980 · February 8, 2019, 9:18am

Thank you mvdbeek,
I added the command you suggested to the command tag of my tool.xml file, and it works if I run it for one file. However, if I repeat the command to create symlinks for 4 files using a for statement, the loop gets stuck. I did not find any example showing how to use loops in the command statement. Could that be the cause?

This is my code:

<command>
for ( i=1; i<=4; i++ ); do
  ln -s '${ARRAY[$i]}' '${ARRAY[$i]}.element_identifier';
done
</command>

Thank you for your precious help

mvdbeek · February 8, 2019, 10:04am

The language in the <command/> tag is Cheetah. There are many examples in the documentation; one is here.
The documentation is pretty dense, so you might want to follow along with the Planemo tutorial first (https://planemo.readthedocs.io/en/latest/writing_standalone.html and specifically https://planemo.readthedocs.io/en/latest/writing_advanced.html#processing-lists-reductions)

Bioinfo1980 · February 11, 2019, 10:39am

Thank you for your precious help!!!

I resolved the issue as follows:

<![CDATA[
#for $input in $LIST
ln -sf '$input' '$__tool_directory__/Symlink/${input.element_identifier}';
#end for

]]>

Have a nice day
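For readers who want to see what that loop amounts to at run time, here is a rough Python equivalent: one symlink per input, named after its element identifier, replacing any existing link as ln -sf does. The function name `link_inputs`, the file names and the directory layout are all assumptions made for this sketch. It links into a scratch working directory rather than the tool directory, a choice made for the example because the tool directory is shared by every run of the tool.

```python
import os
import tempfile


def link_inputs(inputs, target_dir):
    """Create one symlink per (dataset_path, element_identifier) pair.

    Like `ln -sf`, an existing entry with the same link name is replaced.
    """
    linked = []
    for dataset_path, identifier in inputs:
        link_path = os.path.join(target_dir, identifier)
        if os.path.lexists(link_path):
            os.remove(link_path)
        os.symlink(dataset_path, link_path)
        linked.append(link_path)
    return linked


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = os.path.join(tmp, "files")    # stands in for Galaxy's file store
        work = os.path.join(tmp, "working")   # stands in for a job directory
        os.makedirs(store)
        os.makedirs(work)
        inputs = []
        for number, name in enumerate(["a.txt", "b.txt"], start=1):
            path = os.path.join(store, f"dataset_{number}.dat")
            with open(path, "w") as handle:
                handle.write(name)
            inputs.append((path, name))
        link_inputs(inputs, work)
        print(sorted(os.listdir(work)))
```

A program pointed at the working directory then sees the original file names, while the data stays where Galaxy stored it.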

sRah_sa · October 7, 2021, 3:09pm

Hi!
I know a long time has passed since your question, but I have the same issue. If you remember how you solved it, kindly let me know.

As I understand it, with ln -s X Y, Y becomes a link name pointing to X, so opening Y shows the content of X.

I am writing a Galaxy workflow with a task named calcvol, which needs a .nii.gz file to calculate the brain volume. However, since Galaxy tends to rename files to .dat, I now have two steps: the first step creates this link, and the second step is calcvol, where I use Y (here brain.nii.gz). The first step runs, but then I get an error that this file does not exist. (I haven’t yet figured out how the steps should be specified.)
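That understanding of ln -s X Y can be checked with a few lines of Python. This is a small sketch with made-up file names: `os.symlink(x, y)` plays the role of ln -s X Y, and reading Y returns what was written to X.

```python
import os
import tempfile


def read_through_link(content):
    """Write `content` to X, create Y as a symlink to X, and return
    (file name Y points to, text read by opening Y)."""
    with tempfile.TemporaryDirectory() as tmp:
        x = os.path.join(tmp, "dataset_1.dat")  # plays the role of X
        y = os.path.join(tmp, "original.txt")   # plays the role of Y
        with open(x, "w") as handle:
            handle.write(content)
        os.symlink(x, y)                        # same idea as: ln -s X Y
        with open(y) as handle:
            return os.path.basename(os.readlink(y)), handle.read()


print(read_through_link("hello"))
```

The link is only a directory entry pointing at X, so it exists only in the directory where it was created.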

So I have the first command as:

  ln -s $nifti $(nifti.element_identifier)

and then the second command is:

/.../python -m calcvol --nifti_file '$nifti' --unit '$unit'

I have also tried this one:

#import re
  /.../python -m calcvol --nifti_file '${re.sub('(?:^|\W)dataset.*', '/'+str($nifti.element_identifier), str($nifti))}' --unit '$unit'

But in either case I get the error:

brain.nii.gz does not exist

I also tried passing brain.nii.gz as text rather than as data in the second file, and the error was stranger, showing other text as the file name.

I would appreciate it if you could kindly help me.

Thanks in advance
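One possible reading of this error, which the thread does not confirm: each Galaxy job runs in its own working directory, so a relative link name such as brain.nii.gz created by one step is not present in the directory where a later step runs. The sketch below illustrates only that point. The function name `visible_from` and the two temporary directories standing in for two job directories are assumptions of the example.

```python
import os
import tempfile


def visible_from(link_dir, consumer_dir, name="brain.nii.gz"):
    """Create `name` as a symlink inside link_dir and report whether a
    program working in consumer_dir would find a file called `name`."""
    source = os.path.join(link_dir, "dataset_1.dat")
    with open(source, "wb") as handle:
        handle.write(b"placeholder")
    link_path = os.path.join(link_dir, name)
    if not os.path.lexists(link_path):
        os.symlink(source, link_path)
    return os.path.exists(os.path.join(consumer_dir, name))


with tempfile.TemporaryDirectory() as step1, tempfile.TemporaryDirectory() as step2:
    print(visible_from(step1, step2))  # link created by a different step
    print(visible_from(step1, step1))  # link created where it is used
```

The first call reports False and the second True. If this reading applies, then following the earlier advice in this thread (create the link in the same command block that runs the program, and pass the link name rather than '$nifti' to --nifti_file) would keep the link and its consumer in one directory.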
