PEAKachu peak caller in Galaxy CLIPseq training not calling the RBFOX2 peak published by Van Nostrand et al. 2016

Hi everyone,

I am working through the Galaxy training tutorial "CLIP-Seq data analysis from pre-processing to motif detection" to analyse CLIPseq data on the Galaxy / CLIP-explorer server.
I was wondering if someone knows why the peak caller (PEAKachu) does not call the peak in the chr17 8.364 Mb - 8.370 Mb region that was published in the paper on which the training set is based. The fold enrichment for this peak was reported to be >8, which is well above the threshold set in the peak caller.
I also analyzed one complete replicate (REP2) of the original data from the Yeo lab, and the peak caller still did not return the expected RBFOX2 peak.



Dear Patrick,
The training set is just a tiny sub-sample of the original dataset and might not give you the peak you would find in the paper.

However, with the original data you should obtain the peak. A few notes: PEAKachu relies on DESeq2 to call peaks, and since the new version of DESeq2 it is required that you have two replicates each for CLIP and control. Because the data by Van Nostrand et al. 2016 have only one replicate for the control, PEAKachu cannot calculate p-values for the peaks, which is why you only get fold changes. So be careful if you filter on p-values, as this might result in a false peak set. Another point could be that the parameters are not optimized. Maybe set the MAD to 2.0 and fine-tune the two parameters minimum cluster expression fraction and minimum block expression.
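To check whether the expected peak is in your output at all, and to see what a p-value filter does when no p-values could be computed, something like the following rough sketch might help. It assumes the peak table is tab-separated with a header; the column names (`chrom`, `start`, `end`, `fold_change`, `padj`), the fold-change cutoff of 2.0 and the example rows are all made up for illustration and are not PEAKachu's actual output format, so adjust them to your file.

```python
import csv
import io
import math

# Hypothetical column names; change these to match the header of your peak table.
CHROM, START, END, FC, PADJ = "chrom", "start", "end", "fold_change", "padj"


def parse_float(value):
    """Return a float, or None for empty, NA or NaN entries."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(number) else number


def read_peak_table(handle):
    """Read a tab-separated peak table with a header line into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def peaks_in_region(rows, chrom, region_start, region_end, min_fc=2.0, max_padj=None):
    """Return peaks overlapping the region that pass the fold-change cutoff.

    min_fc=2.0 is an assumed cutoff. If max_padj is None, no p-value filter is
    applied; otherwise peaks without a usable p-value are dropped.
    """
    hits = []
    for row in rows:
        if row[CHROM] != chrom:
            continue
        start, end = int(row[START]), int(row[END])
        if end < region_start or start > region_end:
            continue  # no overlap with the region of interest
        fold_change = parse_float(row[FC])
        if fold_change is None or fold_change < min_fc:
            continue
        if max_padj is not None:
            padj = parse_float(row.get(PADJ))
            if padj is None or padj > max_padj:
                continue
        hits.append(row)
    return hits


# Made-up rows for illustration only; p-values are NA, as with a single control replicate.
EXAMPLE = (
    "chrom\tstart\tend\tfold_change\tpadj\n"
    "chr17\t8365000\t8365100\t8.5\tNA\n"
    "chr17\t9000000\t9000100\t10.0\tNA\n"
    "chr1\t8365000\t8365100\t9.0\tNA\n"
)

rows = read_peak_table(io.StringIO(EXAMPLE))
by_fold_change = peaks_in_region(rows, "chr17", 8_364_000, 8_370_000)
by_padj = peaks_in_region(rows, "chr17", 8_364_000, 8_370_000, max_padj=0.05)
print(len(by_fold_change), "peak(s) by fold change,", len(by_padj), "after p-value filter")
```

With your own data you would replace `io.StringIO(EXAMPLE)` with an opened file. If the region shows up with the fold-change filter but disappears once `max_padj` is set, the missing p-values are the likely reason rather than the peak calling itself.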

If you still struggle to get the required peaks, then maybe try a different peak caller such as PureCLIP.

I hope I could help.

Have a good day and best wishes,