I am a complete newbie to bioinformatics. I am currently doing ChIP-Seq on samples, looking for enrichment at sites for increased/decreased expression of a retinoic acid receptor with an agonist/antagonist. I relied on online tutorials such as this one by Abcam: https://www.abcam.com/webinars/a-step-by-step-guide-to-chip-seq-data-analysis-webinar and YouTube videos.
The data I was given are paired-end reads. Because there are 15 samples and each went through multiple runs, I have 40 .fastq.gz files for each sample.
These are the parameters used on Galaxy:
- Trimmomatic: sliding window trimming with a required average quality of 30, AND drop reads whose average quality is below 30
- Align using bowtie2 using default settings for paired-end reads
- Filter SAM or BAM, output SAM or BAM files on FLAG MAPQ RG LN or by region: minimum MAPQ quality score of 30 and filter on bitwise flag > skip alignments with unmapped reads
- Samtools sort using default settings
- Samtools merge using default settings
- MACS2 call peak: paired end BAM (BAMPE) & default settings
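For anyone who prefers to read the steps above as command lines, here is a rough sketch of what they correspond to outside Galaxy. It only builds and prints the commands; it does not run anything. The file names, the index name `genome_index`, the 4-base window, the `AVGQUAL` step, the `trimmomatic` wrapper name and `-g hs` (human) are assumptions for illustration, not values taken from the Galaxy history.

```python
import shlex
from typing import List


def per_run_commands(run: str, index: str = "genome_index") -> List[List[str]]:
    """Commands for one sequencing run: trim, align, filter, sort."""
    r1, r2 = f"{run}_R1.fastq.gz", f"{run}_R2.fastq.gz"          # hypothetical names
    p1, p2 = f"{run}_R1.paired.fastq.gz", f"{run}_R2.paired.fastq.gz"
    u1, u2 = f"{run}_R1.unpaired.fastq.gz", f"{run}_R2.unpaired.fastq.gz"
    return [
        # Sliding window (assumed 4 bases) at quality 30, then drop low-quality reads
        ["trimmomatic", "PE", r1, r2, p1, u1, p2, u2,
         "SLIDINGWINDOW:4:30", "AVGQUAL:30"],
        # Paired-end alignment with default bowtie2 settings
        ["bowtie2", "-x", index, "-1", p1, "-2", p2, "-S", f"{run}.sam"],
        # Keep MAPQ >= 30 and skip unmapped reads (flag 4), write BAM
        ["samtools", "view", "-b", "-q", "30", "-F", "4",
         "-o", f"{run}.filtered.bam", f"{run}.sam"],
        # Coordinate sort with default settings
        ["samtools", "sort", "-o", f"{run}.sorted.bam", f"{run}.filtered.bam"],
    ]


def merge_and_call(sample: str, runs: List[str], control_bam: str) -> List[List[str]]:
    """Commands to merge all runs of one sample and call peaks against a control."""
    merged = f"{sample}.merged.bam"
    sorted_bams = [f"{run}.sorted.bam" for run in runs]
    return [
        ["samtools", "merge", merged] + sorted_bams,
        # BAMPE = paired-end BAM; "-g hs" assumes a human genome
        ["macs2", "callpeak", "-t", merged, "-c", control_bam,
         "-f", "BAMPE", "-g", "hs", "-n", sample],
    ]


if __name__ == "__main__":
    runs = ["chip1_run1", "chip1_run2"]  # hypothetical run names
    for run in runs:
        for cmd in per_run_commands(run):
            print(shlex.join(cmd))
    for cmd in merge_and_call("chip1", runs, "control.merged.bam"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them makes it easy to compare each flag against what the corresponding Galaxy tool form shows.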
I initially used a quality score of 20 to trim bases and filter reads. The peak calling showed very little difference between the peaks in the ChIP samples and the control. I then ran them through the plotFingerprint tool on Galaxy and they showed very weak enrichment. I was advised by a colleague to raise the quality score to 30 to try to reduce the background noise in the control, but that didn’t work either.
It is possible that not enough wash steps were done during the ChIP, or perhaps something in the analysis steps wasn’t quite right.
Broad and subtle peaks are expected, so are there any tools I should have used instead, e.g. MACS2 bdgbroadcall? I am not sure what parameters to use and have not seen anyone use it.
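For reference, broad-peak calling in MACS2 is usually reached through the `--broad` option of `callpeak` rather than by running `bdgbroadcall` directly. Below is a minimal sketch of what such a call might look like. The file names, the sample name and the human genome size (`hs`) are hypothetical, and 0.1 is simply the documented default for `--broad-cutoff`, not a recommendation.

```python
from typing import List


def broad_callpeak_command(chip_bam: str, control_bam: str, name: str,
                           genome_size: str = "hs",
                           broad_cutoff: float = 0.1) -> List[str]:
    """Build (not run) a MACS2 broad-peak command for paired-end BAM input."""
    return [
        "macs2", "callpeak",
        "-t", chip_bam,                        # ChIP (treatment) BAM
        "-c", control_bam,                     # input/control BAM
        "-f", "BAMPE",                         # paired-end BAM
        "-g", genome_size,                     # assumed organism, adjust as needed
        "-n", name,                            # prefix for output files
        "--broad",                             # link nearby enriched regions into broad peaks
        "--broad-cutoff", str(broad_cutoff),   # cutoff used for the broad regions
    ]


if __name__ == "__main__":
    # Hypothetical file names for illustration only
    print(" ".join(broad_callpeak_command("chip.merged.bam", "control.merged.bam", "chip_broad")))
```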
Any advice would be welcome.
Images of the plotFingerprint results and of the peaks after peak calling in the UCSC browser can be viewed here on Google Drive.