Hi,
I use the RPStblastn tool.
I set the output format to BLAST XML. I would like the ‘query-def’ value to be moved into ‘query-id’ by setting the advanced option ‘Should the query and subject defline(s) be parsed?’ to ‘yes’ (as on the command line).
This doesn’t work.
When I compare the output with and without the option, I get exactly the same result, as if the option were not taken into account.
Is this normal?
Here’s an example:
How it should work:
Without option:
<Iteration_query-ID>Query_1</Iteration_query-ID>
<Iteration_query-def>NODE_1_length_506_cov_10.687361</Iteration_query-def>
With option:
<Iteration_query-ID>NODE_1_length_506_cov_10.687361</Iteration_query-ID>
<Iteration_query-def>No definition line</Iteration_query-def>
Everything on a FASTA “>” title line before the first whitespace is the “identifier”, and everything after it is the “description”. This is the general convention, not something specific to Galaxy or BLAST.
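To make the convention concrete, here is a minimal Python sketch of that split. The function name `split_defline` is made up for illustration and is not part of BLAST or Galaxy; it only shows the "first whitespace" rule described above.

```python
def split_defline(title_line):
    """Split a FASTA title line into (identifier, description).

    Hypothetical helper for illustration: everything before the first
    whitespace is the identifier, everything after is the description.
    """
    # Drop the leading ">" marker and surrounding whitespace.
    text = title_line.lstrip(">").strip()
    # Split once, on the first run of whitespace.
    parts = text.split(None, 1)
    identifier = parts[0] if parts else ""
    description = parts[1] if len(parts) > 1 else ""
    return identifier, description


print(split_defline(">NODE_1_length_506_cov_10.687361"))
print(split_defline(">seq1 some free-text description"))
```

For the title line in this thread there is no whitespace at all, so the whole string ends up as the identifier and the description is empty.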
So it looks like the tool tried to split the title line on the first whitespace, found only one value, and placed that value in a different XML tag depending on the advanced option. The first version carries slightly more information (an auto-generated unique key for the query sequence). The option might matter more if the sequence identifiers were public keys that you wanted to use downstream, or if the query FASTA had a meaningful description, but since this data has neither, the choice comes down to preference.
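If the goal is just to get the original sequence name out of the XML regardless of how the option was set, something like the following Python sketch may help. It assumes the two tag layouts shown in this thread, and it assumes that auto-generated IDs always look like `Query_N`; both are assumptions based on the example above, not a documented guarantee. The function name `query_identifiers` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET


def query_identifiers(xml_text):
    """Return one query identifier per <Iteration> in a BLAST XML string."""
    root = ET.fromstring(xml_text)
    identifiers = []
    for iteration in root.iter("Iteration"):
        query_id = iteration.findtext("Iteration_query-ID", default="")
        query_def = iteration.findtext("Iteration_query-def", default="")
        # Assumption: an ID like "Query_1" was auto-generated, so the real
        # name is the first word of the definition line.
        if re.fullmatch(r"Query_\d+", query_id) and query_def.strip():
            identifiers.append(query_def.split(None, 1)[0])
        else:
            # Otherwise the defline was parsed and the ID is already the name.
            identifiers.append(query_id)
    return identifiers
```

Called on the text of either output shown above, this should return the `NODE_...` name, which makes downstream steps independent of the option.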