Lint failure for deprecated feature, need advice on the best way to transition QIIME 2 Tools

ebolyen · June 7, 2023, 6:06pm

Hello!

I’m one of the lead developers of QIIME 2 and generally the developer behind q2galaxy, which renders all of our actions into Galaxy tools.

I was in the process of rerendering our latest release for publication on the toolshed, and ran into some lint errors from Planemo.

Since the features I’m using are deprecated, it doesn’t really seem like something for planemo’s issue tracker, since I’m already kind of doing the “wrong” thing.

Here’s some more context:

In QIIME 2 we abstract data into what we call artifacts which contain both a semantic type (which defines what kinds of actions you can run) and a format (which defines what kinds of automated transformations need to be done). This information, provenance, and the raw data itself (in some arbitrary but ideally obvious file format) are all zipped together into a single file we call a .qza (the artifact).
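To make the "everything zipped into one file" idea concrete, here is a minimal sketch of pulling a semantic type out of such an archive. The layout it assumes (a single top-level directory holding a `metadata.yaml` with a `type:` line) and the helper names `make_toy_qza` and `read_semantic_type` are illustrative assumptions, not a specification of the real .qza format or of QIIME 2's API.

```python
import io
import zipfile


def make_toy_qza(semantic_type: str) -> bytes:
    """Build a toy archive in memory (assumed layout, for illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "toy-root/metadata.yaml",
            f"type: {semantic_type}\nformat: ToyFormat\n",
        )
        archive.writestr("toy-root/data/payload.txt", "raw data\n")
        archive.writestr("toy-root/provenance/action.yaml", "action: toy\n")
    return buffer.getvalue()


def read_semantic_type(qza_bytes: bytes) -> str:
    """Return the value of the `type:` line in the top-level metadata.yaml."""
    with zipfile.ZipFile(io.BytesIO(qza_bytes)) as archive:
        # Assumption: exactly one "<root>/metadata.yaml" entry exists.
        name = next(
            n for n in archive.namelist()
            if n.count("/") == 1 and n.endswith("/metadata.yaml")
        )
        text = archive.read(name).decode("utf-8")
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "type":
            return value.strip()
    raise ValueError("no type entry found in metadata.yaml")


toy = make_toy_qza("FeatureData[AlignedSequence]")
print(read_semantic_type(toy))
```

The point is only that the semantic type travels inside the file, so a generic datatype can surface it as metadata without knowing every type in advance.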

At the user interface level, the only relevant information is the semantic type. However, the set of semantic types is extensible and depends on the plugins that are defined. So to make QIIME 2 work in Galaxy, we defined a generic file format (QZA) which sets Galaxy metadata including the semantic type.

Then we can filter our allowable inputs based on the metadata, instead of the format. This gives us the ability to continue to extend the vocabulary of types, without needing to add a new format to Galaxy each time a plugin decides to do so (there are many, so this would be pretty intractable).

Currently, this is done with the following (deprecated, but suggested at BOSC 2018) syntax:

https://github.com/qiime2/galaxy-tools/blob/11724243061b16476878f0eda7b10c44f35b3f99/tools/suite_qiime2alignment/qiime2alignment__mafft_add.xml#L24-L29

          <param name="alignment" type="data" format="qza" label="alignment: FeatureData[AlignedSequence]" help="[required]  The alignment to which sequences should be added.">
              <options options_filter_attribute="metadata.semantic_type">
                  <filter type="add_value" value="FeatureData[AlignedSequence]"/>
              </options>
              <validator type="expression" message="Incompatible type">hasattr(value.metadata, "semantic_type") and value.metadata.semantic_type in ['FeatureData[AlignedSequence]']</validator>
          </param>

The weird part is the options_filter_attribute combined with filter add_value to create a set that is matched against the available QZAs. Again, this essentially mimics the format filtering that Galaxy does, but at the metadata level. This, however, is a pretty obscure use, and I had to add support for multiple add_value flags to Galaxy at the time.
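The matching described above can be sketched roughly as follows. This is not Galaxy's implementation; `dataset`, `resolve`, and `selectable` are hypothetical helpers, and the history entries are made up. It only shows the idea: collect the add_value entries into a set, then keep datasets whose `metadata.semantic_type` is in that set.

```python
from types import SimpleNamespace


def dataset(name, ext, **metadata):
    """Hypothetical stand-in for a history dataset with format and metadata."""
    return SimpleNamespace(name=name, ext=ext, metadata=SimpleNamespace(**metadata))


def resolve(obj, dotted):
    """Follow a dotted path such as "metadata.semantic_type"; None if missing."""
    for part in dotted.split("."):
        obj = getattr(obj, part, None)
        if obj is None:
            return None
    return obj


def selectable(history, fmt, filter_attribute, added_values):
    """Keep datasets of the right format whose attribute is in the allowed set."""
    allowed = set(added_values)  # one entry per <filter type="add_value"/>
    return [
        d for d in history
        if d.ext == fmt and resolve(d, filter_attribute) in allowed
    ]


history = [
    dataset("aligned.qza", "qza", semantic_type="FeatureData[AlignedSequence]"),
    dataset("seqs.qza", "qza", semantic_type="FeatureData[Sequence]"),
    dataset("reads.fasta", "fasta"),
]
matches = selectable(
    history, "qza", "metadata.semantic_type", ["FeatureData[AlignedSequence]"]
)
print([d.name for d in matches])
```

Supporting several add_value filters then just means a larger `allowed` set.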

Then I added a validator for situations where the user may manually select some option not in the generated options list.
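For reference, the validator expression can be exercised outside Galaxy with stand-in objects. This sketch assumes the expression is evaluated as Python with the selected dataset bound to `value`, which is what the expression itself implies; `fake_dataset` and `passes` are hypothetical names used only here.

```python
from types import SimpleNamespace

# The expression text is copied from the <validator> element above.
EXPRESSION = (
    'hasattr(value.metadata, "semantic_type") '
    "and value.metadata.semantic_type in ['FeatureData[AlignedSequence]']"
)


def fake_dataset(**metadata):
    """Hypothetical stand-in exposing only a `.metadata` attribute."""
    return SimpleNamespace(metadata=SimpleNamespace(**metadata))


def passes(value) -> bool:
    """Evaluate the validator expression with `value` as the only local name."""
    return bool(eval(EXPRESSION, {}, {"value": value}))


good = fake_dataset(semantic_type="FeatureData[AlignedSequence]")
wrong_type = fake_dataset(semantic_type="FeatureData[Sequence]")
no_metadata = fake_dataset()

print(passes(good), passes(wrong_type), passes(no_metadata))
```

The `hasattr` guard is what keeps a dataset without the metadata field from raising an error instead of simply failing validation.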

The issue I’m now encountering is that Planemo doesn’t recognize this as a valid combination:

 +Linting tool /home/runner/work/galaxy-tools/galaxy-tools/tools/suite_qiime2__alignment/qiime2__alignment__mafft_add.xml
Failed linting
Applying linter tests... WARNING
.. WARNING: No tests found, most tools should define test cases.
Applying linter output... CHECK
.. INFO: 1 outputs found.
Applying linter inputs... FAIL
.. ERROR: Data parameter [alignment] filter needs to define a ref attribute
.. ERROR: Data parameter [alignment] for filters only type="data_meta" is allowed, found type="add_value"
.. ERROR: Data parameter [sequences] filter needs to define a ref attribute
.. ERROR: Data parameter [sequences] for filters only type="data_meta" is allowed, found type="add_value"
.. INFO: Found 7 input parameters.
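The errors above are consistent with a check along the following lines. This is a rough approximation written for illustration, not the actual linter code; `lint_data_param` is a hypothetical function and the XML is a shortened copy of the parameter from this post.

```python
import xml.etree.ElementTree as ET

PARAM_XML = """
<param name="alignment" type="data" format="qza">
    <options options_filter_attribute="metadata.semantic_type">
        <filter type="add_value" value="FeatureData[AlignedSequence]"/>
    </options>
</param>
"""


def lint_data_param(param):
    """Approximate the two rules reported in the log for a data parameter."""
    errors = []
    name = param.get("name")
    for filt in param.findall("./options/filter"):
        filter_type = filt.get("type")
        if filt.get("ref") is None:
            errors.append(
                f"Data parameter [{name}] filter needs to define a ref attribute"
            )
        if filter_type != "data_meta":
            errors.append(
                f"Data parameter [{name}] for filters only "
                f'type="data_meta" is allowed, found type="{filter_type}"'
            )
    return errors


for message in lint_data_param(ET.fromstring(PARAM_XML)):
    print(".. ERROR:", message)
```

Under these assumed rules, any add_value filter on a data parameter fails twice: once for the missing ref and once for the filter type.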

So my question is, what should we do now? It seems like data_meta is almost, but not quite the kind of filter we need, as it seems to be concerned with the metadata of a sibling parameter. And there’s not a clear way to define a list of valid options.

What would be our best approach here? Is this actually an issue with Planemo, or have we just kind of fallen into the inevitable state of a deprecated feature?

I think it’s fair to say that there’s a substantial amount of interest in Galaxy from our users, so if there was a need to create an alternative non-deprecated way of making this happen (some kind of explicit metadata based filter), we could contribute towards that effort. In fact, our semantic types are actually a little bit more general than a list (they form an algebra amongst themselves), so improving support beyond that (I’m not sure how exactly at this moment) could actually be pretty helpful.

I do also acknowledge that the way QIIME 2 thinks about data is pretty alien relative to the world of workflow-description languages which generally have a pretty shallow concept of file formats. And that integrating with those may be the more relevant direction for your project, so we’re definitely coming from a completely unrelated angle which may not be of much interest.

But I am exceptionally pleased at how robust Galaxy’s interface has been, that filtering to the semantically relevant actions was possible at all. WDL and CWL (and interfaces for composing them) do not share that capacity, and make for pretty disappointing interface targets for us, relative to Galaxy’s tool XML.

jennaj · June 7, 2023, 6:27pm

Hi @ebolyen

The IUC would be the best people to help with this. Maybe there is another way to model the metadata, not sure.

I’ve cross-posted this over to their chat for followup. They may reply here or there, and feel free to join the chat. You're invited to talk on Matrix

Hope this works out! We have started to get more questions about the tool suite at this forum, and I’m not always sure how to help. I’ll add a tag to your topic that will link to that prior Q&A for context. Getting this working smoothly would be very good for everyone!

bernt-matthias · June 7, 2023, 7:54pm

First of all I guess that the linter is wrong here. The linting code is defined here: https://github.com/galaxyproject/galaxy/blob/77e885285eecbe38d16afad27345ea2955619748/lib/galaxy/tool_util/linters/inputs.py#L202 I guess I’m responsible for most of it … when I wrote this I used the available docs and my understanding of the code … which are both incomplete

So, this can be fixed. It would be nice if you could file an issue and/or open a PR (a functional tool test would also be nice… if there is none yet). In any case changes to the linter take a very long time until they are used in planemo (so that is no fast solution).

Also: Galaxy will likely support “deprecated” features of tool syntax for a long time. The term is only used to discourage the use if possible … to my understanding.

The deprecation notice was added in galaxyproject/galaxy Pull Request #11832, “Make tool parameter options capable of referring to metadata file” by qiagu … seems that I also have done this … Maybe I was wrong. If I was right, I would assume that the new functionality added in this PR should be able to solve your task. Maybe worth a try.

So what are the other alternatives? Maybe a validator, as in https://github.com/galaxyproject/tools-iuc/blob/5c1b8a2b105a80e236f88e71a743147d79925ac4/tools/mitos/mitos.xml#LL45C1-L45C1. The problem is that users can still select the wrong datasets, but they will see a warning and won’t be able to submit the tool. Probably not that user friendly.

Bottom line is that in my opinion you can ignore these linter warnings (as long as the tools work) and I can only suggest to try the new features implemented in the referred PR.

But I am exceptionally pleased at how robust Galaxy’s interface has been,

That’s great feedback … I will forward it to the devs …

ebolyen · June 8, 2023, 5:45pm

Thanks @bernt-matthias!

That’s a relief to hear. I figured our time was already up just as we were getting started.

Re: linting, that all makes sense, I’ll disable our lint job for the moment. And then we can see about fixing the linter or at least getting a good writeup/test-case written for planemo.

Re: the new functionality, if we can use from_dataset to refer to the parent’s own property, then I think it might work. This fragment reads like something that would do what we need, but I’m not certain it will work, as the emphasis is definitely on a sibling parameter. But, that does seem like a really minor thing and I bet we could help extend that to self-reference if needed, since it would just extend what it is already for!

  <param name="alignment" type="data" format="qza" label="alignment: FeatureData[AlignedSequence]" help="[required]  The alignment to which sequences should be added.">
      <options from_dataset="alignment" meta_file_key="semantic_type">
          <filter type="add_value" value="FeatureData[AlignedSequence]"/>
      </options>
  </param>

The linter might still dislike the add_value being nested, but that also seems workable.

Converting

 <options options_filter_attribute="metadata.semantic_type">

to

 <options from_dataset="alignment" meta_file_key="semantic_type">

Should not be so hard.
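If the rewrite turns out to work, the attribute change itself could be scripted across rendered tools. The sketch below is only the mechanical XML edit; `convert_options` is a hypothetical helper, and whether Galaxy accepts a `from_dataset` that points at the parameter itself is still the open question from this thread.

```python
import xml.etree.ElementTree as ET

OLD_PARAM = """
<param name="alignment" type="data" format="qza">
    <options options_filter_attribute="metadata.semantic_type">
        <filter type="add_value" value="FeatureData[AlignedSequence]"/>
    </options>
</param>
"""


def convert_options(param):
    """Rewrite options_filter_attribute="metadata.X" into from_dataset/meta_file_key."""
    name = param.get("name")
    prefix = "metadata."
    for options in param.findall("options"):
        attribute = options.get("options_filter_attribute", "")
        if not attribute.startswith(prefix):
            continue  # leave anything we do not recognize untouched
        del options.attrib["options_filter_attribute"]
        # Assumption: the parameter refers to itself, as proposed in the thread.
        options.set("from_dataset", name)
        options.set("meta_file_key", attribute[len(prefix):])
    return param


converted = convert_options(ET.fromstring(OLD_PARAM))
print(ET.tostring(converted, encoding="unicode"))
```

The nested add_value filters are left as they are, so the linter concern about them would still need to be handled separately.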
