BWA MEM2 tool troubleshooting

Hi @huiping

To control for low quality bases, you can run some quality assurance on the original reads. This is a good topic covering the usual ways to do this. → Quality Control Start Here! multQC issue and guidance?

Then, to control for low quality alignments, you can use Filter BAM to remove reads from the result. You keep only the reads with the quality properties you are interested in: for example, only primary alignments, only alignments from proper pairs, or only alignments at or above a minimum mapQ value such as 20 or 30.
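To make those filtering criteria concrete, here is a minimal sketch in plain Python that applies them to SAM-formatted text lines. This is only an illustration of the logic, not how the Filter BAM tool is implemented. The function names, the example records, and the choice to drop supplementary alignments along with secondary ones are my assumptions; the FLAG bit values and column positions come from the SAM format.

```python
# FLAG bits as defined in the SAM format
FLAG_PROPER_PAIR = 0x2
FLAG_UNMAPPED = 0x4
FLAG_SECONDARY = 0x100
FLAG_SUPPLEMENTARY = 0x800


def keep_alignment(sam_line, min_mapq=20, require_proper_pair=True):
    """Return True if one SAM alignment line passes the filters.

    Hypothetical helper: the defaults are illustrative, not recommendations.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2 is FLAG
    mapq = int(fields[4])  # column 5 is MAPQ
    if flag & FLAG_UNMAPPED:
        return False
    # Assumption: "primary" here excludes secondary and supplementary records
    if flag & (FLAG_SECONDARY | FLAG_SUPPLEMENTARY):
        return False
    if require_proper_pair and not flag & FLAG_PROPER_PAIR:
        return False
    return mapq >= min_mapq


def filter_sam(lines, **kwargs):
    """Keep header lines (starting with '@') and alignments that pass."""
    return [
        line for line in lines
        if line.startswith("@") or keep_alignment(line, **kwargs)
    ]


# Made-up records for illustration only
EXAMPLE_RECORDS = [
    "r1\t99\tchr1\t100\t60\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII",   # proper pair, mapQ 60
    "r2\t355\tchr1\t500\t0\t10M\t=\t600\t110\tACGTACGTAC\tIIIIIIIIII",   # secondary alignment
    "r3\t99\tchr1\t100\t5\t10M\t=\t200\t110\tACGTACGTAC\tIIIIIIIIII",    # low mapQ
    "r4\t65\tchr1\t100\t60\t10M\tchr2\t200\t0\tACGTACGTAC\tIIIIIIIIII",  # not a proper pair
]
EXAMPLE_SAM = ["@SQ\tSN:chr1\tLN:1000"] + EXAMPLE_RECORDS

if __name__ == "__main__":
    for line in filter_sam(EXAMPLE_SAM, min_mapq=20):
        print(line)
```

Setting `require_proper_pair=False` would keep `r4` as well, which is the kind of trade-off to decide based on your protocol.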

Then, depending on the protocol, you might want to Mark Duplicates or BamLeftAlign.

You can also tune parameters with the alignment tool itself. Most of the command line options are available on the tool forms of the mapping tools, and some forms have preset groupings for specific read types.

From what you have now:

Yes, you have an insertion that has shifted one of the reads in your group, but none of those sequences appear to be good representatives of the reference sequence. Are you sure that all of this data was generated using the exact same reference sequence assembly? There seems to be more going on here. I would start by confirming the upstream steps. Did the mapping job use the same reference fasta as you are using in the IGV browser?

For a discussion about how different assembly versions can lead to issues, see this guide. It is focused on human, but all genomes work the same way: slight differences between versions of an assembly can lead to coordinate mismatch issues. → Reference genomes at public Galaxy servers: GRCh38/hg38 example
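One quick way to check whether the mapping reference and the browser reference agree is to compare the sequence names and lengths recorded in the BAM header (the `@SQ` lines, which `samtools view -H` prints) against those in the fasta. The sketch below does that in plain Python on text input. The function names and example data are hypothetical, and matching names and lengths do not prove the sequences are identical; a mismatch, though, is a strong sign that two different assemblies are involved.

```python
def sq_lengths(sam_header_text):
    """Collect {sequence name: length} from the @SQ lines of a SAM/BAM header."""
    lengths = {}
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type looks like TAG:value
        tags = dict(field.split(":", 1) for field in line.split("\t")[1:])
        lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def fasta_lengths(fasta_text):
    """Collect {sequence name: length} from fasta text.

    The name is taken as the first whitespace-delimited word after '>'.
    """
    lengths = {}
    name = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def compare_references(bam_lengths, fasta_lens):
    """List every sequence in the BAM header that is missing from the fasta
    or has a different length there."""
    problems = []
    for name, length in bam_lengths.items():
        if name not in fasta_lens:
            problems.append(f"{name}: in BAM header but not in FASTA")
        elif fasta_lens[name] != length:
            problems.append(
                f"{name}: length {length} in BAM header, {fasta_lens[name]} in FASTA"
            )
    return problems


# Made-up data for illustration only
EXAMPLE_HEADER = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:8\n"
    "@SQ\tSN:chr2\tLN:4\n"
    "@SQ\tSN:chr3\tLN:5\n"
)
EXAMPLE_FASTA = ">chr1 example\nACGT\nACGT\n>chr2\nACGTAC\n"

if __name__ == "__main__":
    for problem in compare_references(
        sq_lengths(EXAMPLE_HEADER), fasta_lengths(EXAMPLE_FASTA)
    ):
        print(problem)
```

An empty result means the names and lengths line up; anything reported is worth tracing back to the mapping job's reference choice.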

Let’s start there! :slight_smile: