I was looking at the insert size metrics in your ATAC-Seq tutorial (https://training.galaxyproject.org/training-material/topics/epigenetics/tutorials/atac-seq/tutorial.html), and one of the plots had blue regions corresponding to reverse-forward oriented pairs. Should I fix this, and if so, how? My reads are 75 bp long, and if I trim them to 36 bp I do not see this issue.
After filtering out duplicates/chrM/non-paired reads, if you have a small fraction of reads aligning in the opposite orientation of the expected/majority orientation, it won’t matter. These will fall out in downstream steps (won’t pass peak calling criteria).
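If you want to check how small that fraction actually is, you could tally pair orientations straight from the alignment records. Below is a minimal Python sketch, assuming you have already pulled the relevant SAM fields out of each record as a tuple of (FLAG, POS, alignment end, PNEXT, TLEN). The function names are made up for illustration, and the FR/RF/tandem rule is modeled on my understanding of the convention used by Picard-style tools, so treat it as an approximation rather than the exact logic behind the plot.

```python
from collections import Counter

REVERSE = 0x10       # SAM flag bit: this read is on the reverse strand
MATE_REVERSE = 0x20  # SAM flag bit: the mate is on the reverse strand


def pair_orientation(flag, pos, end, mate_pos, tlen):
    """Classify a paired read as 'FR', 'RF', or 'TANDEM'.

    pos and end are the 1-based alignment start and end of this read,
    mate_pos is the PNEXT field, and tlen is the signed TLEN field.
    """
    read_rev = bool(flag & REVERSE)
    mate_rev = bool(flag & MATE_REVERSE)
    if read_rev == mate_rev:
        return "TANDEM"  # both reads on the same strand
    # 5' end of whichever read sits on the forward strand
    plus_5p = mate_pos if read_rev else pos
    # 5' end of whichever read sits on the reverse strand
    minus_5p = end if read_rev else pos + tlen
    return "FR" if plus_5p < minus_5p else "RF"


def orientation_fractions(records):
    """Return the fraction of records in each orientation class."""
    counts = Counter(pair_orientation(*rec) for rec in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}
```

Running `orientation_fractions` on the filtered alignments gives you one number per orientation, which makes it easy to judge whether the unexpected class is a small minority.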
Don’t trim your reads arbitrarily. You want the full length; otherwise, you will lose valuable sequence data.
The reason the “wrong orientation” pairs are not showing up in the graph after trimming is that the shorter reads are not mapping at all and/or are not properly pairing uniquely. That kind of mapping result will impact your entire dataset, not just the reads mapping with the unexpected orientation. If you check your alignment stats trimmed versus untrimmed, you’ll be able to see the difference (how much data you are losing).
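The trimmed-versus-untrimmed comparison can be sketched in a few lines of Python. This assumes you have extracted the FLAG column from each alignment file into a list of integers; `alignment_summary` is a hypothetical helper, not part of any tool. In practice, a tool such as samtools flagstat reports the same kind of numbers.

```python
UNMAPPED = 0x4         # SAM flag bit: read is unmapped
PROPER_PAIR = 0x2      # SAM flag bit: aligner marked the pair as proper
SECONDARY = 0x100      # SAM flag bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def alignment_summary(flags):
    """Summarize mapping rates from an iterable of SAM FLAG integers.

    Secondary and supplementary records are skipped so that each read
    is counted once.
    """
    primary = [f for f in flags if not f & (SECONDARY | SUPPLEMENTARY)]
    total = len(primary)
    if total == 0:
        return {"reads": 0, "mapped": 0.0, "properly_paired": 0.0}
    mapped = sum(1 for f in primary if not f & UNMAPPED)
    proper = sum(
        1 for f in primary if f & PROPER_PAIR and not f & UNMAPPED
    )
    return {
        "reads": total,
        "mapped": mapped / total,
        "properly_paired": proper / total,
    }
```

Call it once on the flags from the 75 bp alignment and once on the flags from the 36 bp alignment, then compare the `mapped` and `properly_paired` fractions side by side to see how much data the trimming costs you.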
Yes, I do lose a significant number of reads and see a difference in alignment. I would like to use the 75 bp reads. After aligning, if I want to focus on nucleosome-free regions (<120 bp), for instance, would these reverse-oriented reads be picked up, since they mostly fall under this cutoff?