Hi all,
Can I perform RNA-seq analysis on samples in FastQ format and samples in BAM format at the same time? I am asking because I cannot convert the BAMs back to FastQ: I do not know which tools were used to generate these BAMs.
Hi @wmsalsah
If you don’t know how the BAMs were constructed, start over with the reads for all samples.
The fastq sequences can be extracted from any intact BAM. Example tool: Operate on and transform BAM datasets in various ways
These tutorials can help with understanding fastq data (format/content/variants) along with example QA steps/tools. Also see domain/topic tutorials for protocol-specific QA steps.
There will be a training event in May that you might be interested in. You could get started with the Intro tutorials at any time, then work through the more advanced topics during the event.
Hi jennaj,
Thank you very much for your reply
I believe that converting BAMs back to FastQ with a tool other than the one used to generate the BAMs, or with a different version of it, may result in systematic errors (incorrect sequences). For the samples that I have, unfortunately, there is no way to know what tool was used to generate the BAM files.
So, can you advise, please?
Hi @wmsalsah
What tool produced that BAM should not matter. You would be just extracting the reads out of it, and leaving the rest of the information behind.
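To illustrate why the mapper does not matter, here is a minimal Python sketch of what "extracting the reads" amounts to, working on the text (SAM) form of the alignment records. It is not the implementation used by any Galaxy tool, and the function and constant names are hypothetical. It relies only on the SAM convention that a read flagged 0x10 is stored reverse-complemented, so the original read is recovered by reversing that step. As a simplifying assumption, records without a stored sequence or quality string are skipped.

```python
# Sketch only: rebuild FASTQ records from SAM alignment lines.
# All names below are hypothetical and chosen for illustration.

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

FLAG_REVERSE = 0x10         # SEQ is stored reverse-complemented
FLAG_FIRST_MATE = 0x40      # first read of a pair
FLAG_SECOND_MATE = 0x80     # second read of a pair
FLAG_SECONDARY = 0x100      # secondary alignment (duplicate copy of a read)
FLAG_SUPPLEMENTARY = 0x800  # supplementary alignment (partial copy of a read)


def sam_line_to_fastq(line):
    """Return a 4-line FASTQ record for one SAM line, or None to skip it."""
    fields = line.rstrip("\n").split("\t")
    name, flag = fields[0], int(fields[1])
    seq, qual = fields[9], fields[10]

    # Keep one copy of each read: drop secondary/supplementary records.
    if flag & (FLAG_SECONDARY | FLAG_SUPPLEMENTARY):
        return None
    # Assumption: skip records with no stored sequence or qualities.
    if seq == "*" or qual == "*":
        return None

    # Undo the strand flip applied when the read mapped to the reverse strand.
    if flag & FLAG_REVERSE:
        seq = seq.translate(COMPLEMENT)[::-1]
        qual = qual[::-1]

    # Mark mates so paired reads can be told apart afterwards.
    if flag & FLAG_FIRST_MATE:
        name += "/1"
    elif flag & FLAG_SECOND_MATE:
        name += "/2"

    return f"@{name}\n{seq}\n+\n{qual}\n"
```

Nothing in this logic depends on which program wrote the file: the read name, sequence, qualities, and flag sit in the same columns regardless of the mapper.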
The BAM header may have more details about prior mapping jobs but for this purpose it would make no difference. Even if it is just a BAM file with “sequences only” (not mapping results), extracting sequences is possible with this or related tools (search the tool panel by datatype).
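If you are curious about what the header records, the `@PG` lines are where programs usually note their name, version, and command line. Below is a small Python sketch that pulls those lines out of header text, for example the output of `samtools view -H` on the BAM. The function name and the sample header in the usage note are made up for illustration, and many BAMs carry no `@PG` lines at all.

```python
# Sketch only: list the programs recorded in a SAM/BAM header.
# The function name is hypothetical.

def list_programs(header_text):
    """Return one dict of tag -> value for each @PG line in the header."""
    programs = []
    for line in header_text.splitlines():
        if not line.startswith("@PG"):
            continue
        tags = {}
        # Fields look like "ID:hisat2"; split on the first colon only,
        # because values such as command lines may contain colons.
        for field in line.split("\t")[1:]:
            key, _, value = field.partition(":")
            tags[key] = value
        programs.append(tags)
    return programs
```

For a made-up header such as `"@HD\tVN:1.6\n@PG\tID:hisat2\tPN:hisat2\tVN:2.2.1\n"`, this returns a single dictionary with the `ID`, `PN`, and `VN` tags. An empty list means the header records no program history.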
Maybe try and see what results? Or, you could choose to not use this data in your analysis. Incorporating a BAM dataset with unknown provenance is not recommended for any analysis, in Galaxy or otherwise.