Is there a more straightforward way to split a BAM mapping file? Currently I split a BAM with Split BAM, which gives me a mapped and an unmapped BAM. Then I use SamToFastq to get the paired reads, and after that I have to compress the files back to fastqsanger.gz. It seems there is probably a more straightforward way to do this?
Also, I have only been using a single reference for each mapping. Is there a way to use multiple reference sequences and then extract the mappings for each one at the same time?
Hi @Jon_Colman
Samtools fastx can produce FASTQ output directly from a BAM, either uncompressed or compressed.
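For anyone working outside Galaxy, here is a rough sketch of the equivalent command-line route. It assumes a reasonably recent samtools is installed, and the function names and file names are made up for illustration. It only builds and prints the commands rather than running them. The idea is that `samtools fastq` can filter on the unmapped flag and write gzipped R1/R2 files in one pass, so the separate split and compress steps may not be needed.

```python
import shlex

def fastq_commands(bam, prefix, mapped):
    """Return (collate_argv, fastq_argv) for pulling paired reads out of `bam`."""
    # Group mates together first so R1 and R2 are written in matching order.
    collate = ["samtools", "collate", "-u", "-O", bam]
    if mapped:
        # Exclude unmapped reads (4) and keep excluding secondary and
        # supplementary alignments (256 + 2048), which is the default filter.
        flag_filter = ["-F", "2308"]
    else:
        # Require the unmapped flag; the default exclusion filter still applies.
        flag_filter = ["-f", "4"]
    fastq = ["samtools", "fastq", *flag_filter,
             "-1", f"{prefix}_R1.fastq.gz",          # .gz names are compressed on write
             "-2", f"{prefix}_R2.fastq.gz",
             "-0", "/dev/null",                       # reads flagged as neither/both R1 and R2
             "-s", f"{prefix}_singletons.fastq.gz",  # reads whose mate was filtered out
             "-n", "-"]                               # "-" reads the collated stream from stdin
    return collate, fastq

def as_pipeline(collate, fastq):
    """Join the two argv lists into a single shell pipeline string."""
    return " | ".join(shlex.join(cmd) for cmd in (collate, fastq))

print(as_pipeline(*fastq_commands("sample.bam", "sample_mapped", mapped=True)))
print(as_pipeline(*fastq_commands("sample.bam", "sample_unmapped", mapped=False)))
```

Note that filtering per read means a pair with only one mapped mate is split up, and those reads end up in the singletons file. Whether that matches what Split BAM gives you is worth checking on a small sample.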
If you have many samples, a Galaxy workflow might be a good option. Workflows can be added to the tool panel, so you'll get a personal "tool".
You can probably concatenate the reference genomes into a "super-reference" and use it for mapping. You can then filter the BAM file on chromosomes and regions. For that you will probably need small files in BED format, one describing each individual genome. If you select several BED files, each describing an individual assembly, for a BAM filtering job, you will probably get several BAM files, each with the reads mapped to one assembly.
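A minimal sketch of the super-reference idea, in plain Python with no extra libraries. It assumes ordinary FASTA input, and the function name, the `label__name` prefix scheme, and the output layout are all hypothetical choices for illustration. It concatenates the references, prefixes each sequence name with its file's base name so names stay unique, and writes one BED file per genome covering every sequence in full.

```python
import os

def build_super_reference(fasta_paths, out_fasta, bed_dir):
    """Concatenate FASTA files and write one BED file per input genome."""
    os.makedirs(bed_dir, exist_ok=True)
    bed_paths = []
    with open(out_fasta, "w") as out:
        for path in fasta_paths:
            # Use the file's base name (without extension) as the genome label.
            label = os.path.splitext(os.path.basename(path))[0]
            lengths = {}  # renamed sequence -> length, in file order
            name = None
            with open(path) as fh:
                for line in fh:
                    line = line.strip()
                    if line.startswith(">"):
                        # Keep only the first word of the header, prefixed by the label.
                        name = f"{label}__{line[1:].split()[0]}"
                        lengths[name] = 0
                        out.write(f">{name}\n")
                    elif line:
                        lengths[name] += len(line)
                        out.write(line + "\n")
            # BED is 0-based, half-open: each sequence spans 0..length.
            bed_path = os.path.join(bed_dir, f"{label}.bed")
            with open(bed_path, "w") as bed:
                for seq, length in lengths.items():
                    bed.write(f"{seq}\t0\t{length}\n")
            bed_paths.append(bed_path)
    return bed_paths

# Example call (file names are placeholders):
# build_super_reference(["genomeA.fa", "genomeB.fa"], "super.fa", "beds")
```

Each resulting BED file can then be used to pull out the reads that mapped to that one genome. Keep in mind that reads matching more than one reference equally well may get a low mapping quality in the combined mapping.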
Most of that is a bit over my head, but trying to make a tool could be a good option. So, on the original question of getting back to R1/R2 fastqsanger files from a BAM file, the steps are essentially:
I initially map my reads, which results in a BAM.
I split my BAM into mapped and unmapped (Split BAM).
Then I take the mapped and unmapped BAMs, convert each to fastqsanger.gz with Samtools fastx, and de-interleave the result.
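To make the de-interleave step concrete, here is a small sketch in plain Python. It assumes the interleaved file is gzipped, uses four lines per read, and strictly alternates forward and reverse reads. The function name is just for illustration.

```python
import gzip

def deinterleave(interleaved_gz, r1_gz, r2_gz):
    """Split an interleaved gzipped FASTQ into R1 and R2 files; return the pair count."""
    pairs = 0
    with gzip.open(interleaved_gz, "rt") as src, \
         gzip.open(r1_gz, "wt") as r1, \
         gzip.open(r2_gz, "wt") as r2:
        while True:
            # One pair is 8 lines: 4 for the forward read, 4 for the reverse read.
            record = [src.readline() for _ in range(8)]
            if not record[0]:
                break  # clean end of file
            if not record[7]:
                raise ValueError("truncated pair: file does not hold complete read pairs")
            r1.writelines(record[:4])
            r2.writelines(record[4:])
            pairs += 1
    return pairs

# Example call (file names are placeholders):
# deinterleave("mapped_interleaved.fastq.gz", "mapped_R1.fastq.gz", "mapped_R2.fastq.gz")
```

If singletons are mixed into the interleaved file, the alternation assumption breaks and the R1/R2 files will drift out of sync, so it is safer to separate those out before this step.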
I like the idea of concatenating my references together; that should be much faster than remapping against each one individually.