Filter by attribute of other data table

Hi everyone,

I’m working on a tool where the user selects a proteome and a set of MHC-I alleles. Each proteome has an organism associated with it, and so does each allele. My goal is to filter the list of selectable alleles by the organism of the proteome the user selected. For example, if the user selects a human proteome, they should only be able to select HLA-* alleles; if the user selects a mouse proteome, they should only be able to select H-2* alleles.

This part of my tool XML file lets the user select the proteome:

<param name="proteome" type="select" label="Select a proteome">
    <options from_data_table="Proteomes" />
</param>

This is the part of my XML file that lets the user select the allele:

<param name="allele" type="select" label="Select an allele">
    <options from_data_table="Alleles">
        <filter type="param_value" column="organism" ref="proteome" ref_attribute="organism"/>
    </options>
</param>

The Alleles.loc file can be found here: https://github.com/mrForce/immunoGalaxy/blob/dd543628c10052a34ed11a3bbb408f2da8c238b0/Alleles.loc

The Proteomes.loc file can be found here: https://github.com/mrForce/immunoGalaxy/blob/dd543628c10052a34ed11a3bbb408f2da8c238b0/Proteomes.loc

When I try to use the tool, I get “No options available” for the allele dropdown menu. After trying many variants of the filter element, I gave up and inserted log statements into the ParamValueFilter class in lib/galaxy/tools/parameters/dynamic_options.py. This is the relevant part of the code, with my added logging:

class ParamValueFilter(Filter):
    def __init__(self, d_option, elem):
        Filter.__init__(self, d_option, elem)
        self.ref_name = elem.get("ref", None)
        assert self.ref_name is not None, "Required 'ref' attribute missing from filter"
        column = elem.get("column", None)
        assert column is not None, "Required 'column' attribute missing from filter"
        self.column = d_option.column_spec_to_index(column)
        self.keep = string_as_bool(elem.get("keep", 'True'))
        self.ref_attribute = elem.get("ref_attribute", None)
        if self.ref_attribute:
            self.ref_attribute = self.ref_attribute.split('.')
        else:
            self.ref_attribute = []

    def filter_options(self, options, trans, other_values):
        if trans is not None and trans.workflow_building_mode:
            return []
        log.warning('JORDAN: other values keys: ' + ', '.join(other_values.keys()))
        ref = other_values.get(self.ref_name, None)
        log.warning('JORDAN: ref attributes' + ', '.join(dir(ref)))
        log.warning('JORDAN: ref class' + str(ref.__class__.__name__))
        log.warning('JORDAN: ref string: ' + str(ref))
        for ref_attribute in self.ref_attribute:
            if not hasattr(ref, ref_attribute):
                log.warning('JORDAN: no attribute')
                return []  # ref does not have attribute, so we cannot filter, return empty list
            ref = getattr(ref, ref_attribute)

        ref = str(ref)
        rval = []
        for fields in options:
            if self.keep == (fields[self.column] == ref):
                rval.append(fields)
        return rval

The key thing this logging reveals is that ref is a unicode string: it is simply the “value” field of the selected proteome (in this case: /galaxy-prod/galaxy/tools-dependencies/references/Proteomes/Human.fasta). The proteome’s other fields aren’t carried along with ref, so no matter what ref_attribute is, I can’t filter on the proteome’s organism. The relevant part of the log output is at the bottom of this post.
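To show the failure outside of Galaxy, here is a minimal sketch of the same lookup logic as a standalone Python 3 function. It is a simplified stand-in I wrote for illustration, not the Galaxy class itself, and the allele rows (including the "human"/"mouse" organism values) are hypothetical examples rather than the contents of my .loc files.

```python
def filter_options(options, ref, ref_attribute, column, keep=True):
    """Simplified stand-in for ParamValueFilter.filter_options (no Galaxy imports)."""
    for attr in ref_attribute:
        if not hasattr(ref, attr):
            # Same early exit as in Galaxy: without the attribute, nothing is returned.
            return []
        ref = getattr(ref, attr)
    ref = str(ref)
    return [fields for fields in options if keep == (fields[column] == ref)]


# Hypothetical allele rows in (value, name, organism) order.
alleles = [
    ("HLA-A*02:01", "HLA-A*02:01", "human"),
    ("H-2-Kb", "H-2-Kb", "mouse"),
]

# What the filter actually receives: the proteome's "value" column as a plain string.
selected = "/galaxy-prod/galaxy/tools-dependencies/references/Proteomes/Human.fasta"

# A str has no "organism" attribute, so the loop bails out with an empty list.
result = filter_options(alleles, selected, ["organism"], column=2)
print(result)
```

With a plain string as ref, any non-empty ref_attribute hits the early return, which matches the “no attribute” line in my log and the empty dropdown.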

Is this a configuration problem? If so, is there a way to address it? This is the relevant part of my tool_data_table_conf.xml:

<table name="Proteomes" comment_char="#">
    <columns>value, name, project, organism</columns>
    <file path="tool-data/Proteomes.loc" />
</table>

<table name="Alleles" comment_char="#" allow_duplicate_entries="True">
    <columns>value, name, organism</columns>
    <file path="tool-data/Alleles.loc" />
</table>
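To make the goal concrete, here is a rough sketch in plain Python (not Galaxy code) of the two-step lookup I’m hoping a filter can express: first resolve the selected proteome’s value to its organism using the Proteomes columns, then keep only the Alleles rows with that organism. The function names and all of the rows below are hypothetical and only follow the column order declared above.

```python
def organism_for_proteome(proteome_rows, selected_value):
    """Return the organism of the proteome whose 'value' column matches, else None."""
    for value, name, project, organism in proteome_rows:
        if value == selected_value:
            return organism
    return None


def alleles_for_proteome(proteome_rows, allele_rows, selected_value):
    """Keep only allele rows whose organism matches the selected proteome's organism."""
    organism = organism_for_proteome(proteome_rows, selected_value)
    if organism is None:
        return []
    return [row for row in allele_rows if row[2] == organism]


# Hypothetical rows: Proteomes is (value, name, project, organism),
# Alleles is (value, name, organism).
proteomes = [
    ("/data/Human.fasta", "Human", "demo", "human"),
    ("/data/Mouse.fasta", "Mouse", "demo", "mouse"),
]
alleles = [
    ("HLA-A*02:01", "HLA-A*02:01", "human"),
    ("H-2-Kb", "H-2-Kb", "mouse"),
]

human_alleles = alleles_for_proteome(proteomes, alleles, "/data/Human.fasta")
print(human_alleles)
```

The point of the sketch is that the join needs the Proteomes row, not just the selected value, which is exactly the piece the filter does not seem to have.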

Any help/insight you can provide would be greatly appreciated,

–Jordan

The relevant part of the logging is this:

galaxy.tools.parameters.dynamic_options WARNING 2020-01-30 21:39:27,299 [p:29294,w:1,m:0] [uWSGIWorker1Core1] JORDAN: other values keys: index, allele, search_type_selector, length, rank_filter, current_case, mhcalleles, num_matches_per_spectrum, mgf, proteins, search_type, instrument, proteome, frag_method
galaxy.tools.parameters.dynamic_options WARNING 2020-01-30 21:39:27,299 [p:29294,w:1,m:0] [uWSGIWorker1Core1] JORDAN: ref attributes__add__, __class__, __contains__, __delattr__, __doc__, __eq__, __format__, __ge__, __getattribute__, __getitem__, __getnewargs__, __getslice__, __gt__, __hash__, __init__, __le__, __len__, __lt__, __mod__, __mul__, __ne__, __new__, __reduce__, __reduce_ex__, __repr__, __rmod__, __rmul__, __setattr__, __sizeof__, __str__, __subclasshook__, _formatter_field_name_split, _formatter_parser, capitalize, center, count, decode, encode, endswith, expandtabs, find, format, index, isalnum, isalpha, isdecimal, isdigit, islower, isnumeric, isspace, istitle, isupper, join, ljust, lower, lstrip, partition, replace, rfind, rindex, rjust, rpartition, rsplit, rstrip, split, splitlines, startswith, strip, swapcase, title, translate, upper, zfill
galaxy.tools.parameters.dynamic_options WARNING 2020-01-30 21:39:27,299 [p:29294,w:1,m:0] [uWSGIWorker1Core1] JORDAN: ref classunicode
galaxy.tools.parameters.dynamic_options WARNING 2020-01-30 21:39:27,299 [p:29294,w:1,m:0] [uWSGIWorker1Core1] JORDAN: ref string: /galaxy-prod/galaxy/tools-dependencies/references/Proteomes/Human.fasta
galaxy.tools.parameters.dynamic_options WARNING 2020-01-30 21:39:27,299 [p:29294,w:1,m:0] [uWSGIWorker1Core1] JORDAN: no attribute