Avoid one history item per output for tool execution with many output files

Hi,

We have a tool that creates many (>100) output files, and we would like it to produce a single history item (or whatever you would call it) containing the list of output files, with one download button.

We have managed to configure the tool to create a list (dataset), but we still get all 100 history items as well, one per output file. See the screenshot.

The relevant lines in the tool are

<collection name="output_MC" type="list" label="Conventional Analysis for MC">
    <filter>input == "MC"</filter>
    <discover_datasets pattern="(?P&lt;designation&gt;.+)\.root" directory="Histograms/MC/" />
</collection>

How can we get only the list called “Conventional Analysis for MC” in the screenshot, and not all the other hist.MC. items in the history?

Help much appreciated!

Thanks,
Maiken

Hi @maikenp,
I believe that, due to the inner logic of Galaxy collections, every member of such a collection has to be represented as an independent dataset in the history. This is usually transparent to the user because the datasets are given the “hidden” flag. I suspect the tool you’ve made is not attaching such a flag. Check out the visible attribute in the discover_datasets docs: Galaxy Tool XML File — Galaxy Project 21.01 documentation

edit: actually I think you can just drop the <filter> tag, that is probably duplicating your outputs.
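
For reference, a minimal sketch of where the visible attribute sits on discover_datasets. It reuses the pattern and directory from the question. The explicit value is shown only for illustration, since the Galaxy tool XML documentation describes false as the default for discovered datasets.

```xml
<collection name="output_MC" type="list" label="Conventional Analysis for MC">
    <!-- visible="false" keeps each discovered .root file hidden in the history;
         visible="true" would list every collection element as its own history item -->
    <discover_datasets pattern="(?P&lt;designation&gt;.+)\.root" directory="Histograms/MC/" visible="false" />
</collection>
```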


We will try, thanks.

The full tool wrapper looks like this by the way:

<tool id="ConventionalAnalysis" name="Conventional Analysis">
<description></description>
<requirements>
<requirement type="package" version="1.0.0">fys5555_py3</requirement>
</requirements>
<command>
<![CDATA[ 
python '/storage/software/src/FYS5555/Conventional_Analysis/CodeExample/runSelector.py' '$input';
]]>
</command>
<inputs>
<param name="input" type="select" label="Select Specific for above selection" help="">
       <option value="Data" selected="true">Data</option>
       <option value="MC">MC</option>
</param>
</inputs>
<outputs>
    <data name="output_Data" from_work_dir="Histograms/Data/hist.Data.2016.root">
        <filter>input == "Data"</filter>
    </data>
    <collection name="output_MC" type="list" label="Conventional Analysis for MC">
        <filter>input == "MC"</filter>
        <discover_datasets pattern="(?P&lt;designation&gt;.+)\.root" directory="Histograms/MC/" />
    </collection>
</outputs>
</tool>

Hi again.

Thanks very much for your help. Removing the filter tag worked beautifully, and now the separate output files are hidden by default.

This is what we ended up with:

<tool id="ConventionalAnalysis" name="Conventional Analysis">
<description></description>
<requirements>
<requirement type="package" version="1.0.0">fys5555_py3</requirement>
</requirements>
<command>
 <![CDATA[ 
 python '/storage/software/src/FYS5555/Conventional_Analysis/CodeExample/runSelector.py' '$input';
  ]]>
</command>
<inputs>
<param name="input" type="select" label="Select Specific for above selection" help="">
      <option value="Data" selected="true">Data</option>
      <option value="MC">MC</option>
</param>
</inputs>
<outputs>
	 <discover_datasets pattern="(?P&lt;designation&gt;.+)\.root" directory="Histograms/" recurse="true" visible="false" />
 </collection>
 </outputs>
</tool>

Hi @maikenp,

That is not a valid outputs section (the opening <collection> tag is missing), and the tool would fail to load at this point.
It should probably be:

<outputs>
    <collection name="output_MC" type="list" label="Conventional Analysis for MC">
        <discover_datasets pattern="(?P&lt;designation&gt;.+)\.root" directory="Histograms/" recurse="true" visible="false" />
    </collection>
</outputs>

Your initial tool looks correct, and the filter is supposed to work.
visible="false" is not needed, since that is the default setting, although it doesn’t hurt to add it.
I’d strongly suggest linting and testing tools with planemo.
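
As a rough sketch of what planemo could exercise, a tests section along these lines might be added inside the <tool> element. It assumes that selecting "MC" makes runSelector.py write at least one .root file under Histograms/MC/ and that the collection is still named output_MC as in the initial wrapper. No element names or counts are asserted because they are not known from this thread.

```xml
<tests>
    <!-- Assumption: the MC branch produces the list collection output_MC -->
    <test>
        <param name="input" value="MC" />
        <!-- Only checks that a list collection with this name is created -->
        <output_collection name="output_MC" type="list" />
    </test>
</tests>
```

Running planemo lint and then planemo test on the tool file should flag a malformed outputs section before the tool is installed in Galaxy.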
