Filter BAM datasets on a variety of attributes (with tag)

I have a question (or maybe a bug) to report.
I am using the “Filter BAM datasets on a variety of attributes” tool at usegalaxy.org.
For Bowtie2-generated BAM files, filtering on the XS: and AS: tags does not seem to work properly.
I also tried BAM files generated with STAR, and filtering by AS: works as expected, judging by visual inspection.
I am wondering whether the filter fails to work properly with tags that have negative values.
Thanks a lot if you can clarify this.

E.g., filtering a Bowtie2-generated BAM file with “AS:<-5” gives me a BAM file in which every AS field is 0 (by visual inspection).

@jennaj

Hi @Lidong

I’m not sure how you are comparing or what the goal is, but technically this query wouldn’t find any negative “AS” values in RNA-Star output either. What the alignment score represents is different between these two tools and not directly comparable.

  • Bowtie2 maps ungapped reads (unspliced) – example: WGS data. “AS” in Bowtie2 BAMs is a computed score – “AS=0” represents a perfect ungapped match between the query (read) and the target (genome) for all bases in the query.
  • RNA-Star maps gapped reads (spliced) – example: RNA-seq data. “AS” in RNA-Star BAMs is a summary metric – roughly the number of matched bases minus mismatches, indels, etc., with gaps expected.

The “tag” filtering option is not part of the original underlying tool (Bamtools); it is included in the Galaxy wrapper as a custom filter. I can confirm that negative values are not interpreted – which would explain what you found. If you are interested in exploring the characteristics of an alignment (mismatches, gaps, indels), other tags in the BAM would be better choices. Perhaps the tutorials below will help?
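
If you still want to select reads by a negative “AS” value, one option is to do it outside the wrapper on SAM text. Below is a minimal sketch in plain Python, not how the Galaxy tool works internally. The function names (`alignment_score`, `filter_by_as`) are made up for illustration, and it assumes “AS” is stored as an integer tag (`AS:i:<value>`) in the optional fields, as in typical Bowtie2 output.

```python
import re

# Matches an integer-valued AS tag in a SAM optional field, e.g. "AS:i:-12".
AS_TAG = re.compile(r"^AS:i:(-?\d+)$")


def alignment_score(sam_line):
    """Return the AS value of one SAM record as an int, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for field in fields[11:]:  # optional tags start at column 12
        match = AS_TAG.match(field)
        if match:
            return int(match.group(1))
    return None


def filter_by_as(sam_lines, max_score):
    """Yield header lines plus records whose AS is strictly below max_score."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # keep the header so the output is still valid SAM
            continue
        score = alignment_score(line)
        if score is not None and score < max_score:
            yield line


if __name__ == "__main__":
    # Tiny hand-written records (hypothetical data, not real tool output).
    records = [
        "@HD\tVN:1.6",
        "r1\t0\tchr1\t100\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tAS:i:0\tXS:i:-6",
        "r2\t0\tchr1\t200\t30\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tAS:i:-12\tXS:i:-18",
    ]
    for kept in filter_by_as(records, -5):
        print(kept)
```

To use something like this on a real dataset, you would first convert the BAM to SAM text with the header included (for example with `samtools view -h`), then pass the lines through `filter_by_as`. Comparing integers rather than strings is what makes the negative threshold behave as expected.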