WGS Alignments that tolerate large, unknown, non-genomic insertions

Dhruv_Patel · August 16, 2021, 7:09pm

I completed a WGS alignment of 150 bp paired-end FASTQ reads with Bowtie2 against a locally installed plant genome. The alignment worked well, and I can visualize the reads in the Integrative Genomics Viewer (IGV). However, at my site of interest I see a large set of paired reads whose mates are not mapped. I suspect there is a non-reference sequence insertion at that site, that Bowtie2 cannot align the reads covering it, and that those reads are dropped during processing. There doesn’t seem to be an intuitive way to find these discarded reads of interest.

Is anyone aware of workflows on Galaxy that tolerate non-genomic sequences better? Perhaps de novo genome assembly workflows that can handle large genomes (400 Mbp)? Apologies in advance if this is not a Galaxy question and belongs on a more bioinformatics-oriented forum.

jennaj · August 16, 2021, 11:40pm

Hi @Dhruv_Patel

Reads that do not align with the reference can be output by the tool. See the option “Write unaligned reads (in fastq format) to separate file(s)” on the Bowtie2 tool form (near the top).
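
If you also want to see which mapped reads at your site have an unmapped mate, the SAM FLAG field records that directly (bit 0x4 means the read is unmapped, bit 0x8 means its mate is unmapped). Below is a minimal sketch in plain Python, assuming the alignments have been exported as SAM text. The function name `orphan_anchors`, the sample records, and the region coordinates are made up for illustration and are not part of any Galaxy tool.

```python
# SAM FLAG bits as defined in the SAM specification
READ_PAIRED = 0x1
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def orphan_anchors(sam_lines, chrom, start, end):
    """Yield (read_name, position) for mapped reads inside chrom:start-end
    whose mate is unmapped. These reads "anchor" a possible insertion site.
    (Hypothetical helper, written for this example only.)"""
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if not (flag & READ_PAIRED):
            continue  # single-end record, no mate to check
        if flag & READ_UNMAPPED:
            continue  # this read itself is unmapped
        if not (flag & MATE_UNMAPPED):
            continue  # mate is mapped, so not an orphan
        if rname == chrom and start <= pos <= end:
            yield name, pos


# Made-up SAM records, for illustration only
SAMPLE = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t42\t150M\t=\t300\t350\t*\t*",    # proper pair
    "r2\t73\tchr1\t1050\t42\t150M\t=\t1050\t0\t*\t*",    # mapped, mate unmapped
    "r2\t133\tchr1\t1050\t0\t*\t=\t1050\t0\t*\t*",       # the unmapped mate
    "r3\t73\tchr2\t1100\t42\t150M\t=\t1100\t0\t*\t*",    # orphan on another contig
]

if __name__ == "__main__":
    for name, pos in orphan_anchors(SAMPLE, "chr1", 1000, 2000):
        print(name, pos)
```

The read names this returns can then be looked up in the unaligned FASTQ output to pull the mates that presumably fall inside the insertion. With samtools, filtering with `-f 8 -F 4` selects the same class of reads from a BAM file.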

And you could certainly try assembly options to reconstruct any novel regions. Tutorials:
