Different Results in CRISPR Screening Tutorial

I followed the CRISPR Screening Analysis tutorial and used the same data, but I’m getting different results. The sgRNA counts and volcano plot results are significantly different from what’s shown in the tutorial. Is it normal for the analysis to yield different results, or could I have missed a step in the process?





Hi @phe0908

Yes, there could be a few reasons for this. Did you run the tutorial’s workflow on the input sample data, or did you do this by hand directly? Maybe try running the workflow to create an “answer key” history to compare to, and you might find where the mixup happened.

If that still produces a problem, it helps to reproduce it exactly so we can sort out where it comes from. If the workflow itself has a problem, we can follow up to get that fixed and find you a workaround.

This was the tutorial, correct?

Which server did you run this on? The URL at the top of your browser window is what we would be interested in. Using one of the “Known Working” servers listed on the tutorial is usually a good idea. Others can work, but the reference data or tool versions may be different, and that might not be obvious and can cause problems while you are learning.

You can even share back your history that has the output of the workflow run. See → How to get faster help with your question. That share link can be posted back here in the reply. If you had to make any modifications to the workflow when you imported it, you can explain and share that workflow back, too, and we can explain how to fix it up.

Let’s start there! Let us know if you solve this yourself and we can definitely follow up! :slight_smile:

Update

After reviewing this tutorial and the workflow, I was reminded of a few places where this analysis can go wrong. I’m still looking into this to see if I can reproduce exactly where yours went wrong, but maybe these early results help? You are still welcome to share back what you have for more specific help.

  1. For now – use the data and methods in the hands-on, not the workflow.
  • The parameters for the QA steps in the workflow differ slightly from the hands-on methods, specifically at the Cutadapt step.
  • This could definitely influence results, since the difference is in which specific adaptors are used and in the orientation of those adaptors. Trimming matters a bit more for this type of sequencing due to the other sequence tags involved.
  • I suspect the hands-on is a correct match for the current reads included. This tutorial was updated recently and might have some more work pending. More feedback about this later.
  • This won’t matter too much, since the data here is mostly an example of how to do these steps rather than a scientifically interesting result.
  2. Be sure to use the pre-computed counts for downstream data exploration.
  • There is an intentional switch from the initial fastq read subset to the full sized fastq reads after the first significant data reduction step. If your end result graphics differ from the tutorial’s, then either the subset count dataset was used downstream (it will not produce graphics similar to the full sized counts), or the trimming difference above had an impact.
  • Specifically, at the Hands-on: Test for enrichment step, MAGeCKs test should be run on the pre-computed count file that was uploaded at the earlier step.
  • There are comments in the tutorial when that data is uploaded that explain a bit more about why we did this. In short: learning how to do those early manipulations with real data is useful, but the “full sized” real data takes too long to process for a practical short tutorial.
  • When running your own data through similar processing, the computational processing time will need to be built into the experiment plan, just as for any other analysis.
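If you want to check which count table you actually fed into the enrichment step, one option is to download both your own count table and the pre-computed one and compare them outside of Galaxy. Below is a minimal sketch of that comparison, not part of the tutorial. It assumes both files are tab-separated in the usual MAGeCK count layout (an `sgRNA` column, a `Gene` column, then one column per sample). The function names, the sample names `T0` and `T8`, and the toy values are all hypothetical and only there for illustration.

```python
import csv
import io


def load_counts(handle):
    """Read a MAGeCK-style count table: sgRNA, Gene, then one column per sample."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[2:]
    counts = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Parse via float so that normalized (non-integer) counts also load.
        counts[row[0]] = {s: float(v) for s, v in zip(samples, row[2:])}
    return samples, counts


def compare_counts(mine, reference):
    """Summarize how two count tables differ (guides present, counts per sample)."""
    mine_samples, mine_counts = mine
    ref_samples, ref_counts = reference
    shared_guides = sorted(set(mine_counts) & set(ref_counts))
    shared_samples = [s for s in mine_samples if s in ref_samples]
    differing = {
        s: sum(1 for g in shared_guides if mine_counts[g][s] != ref_counts[g][s])
        for s in shared_samples
    }
    return {
        "only_in_mine": sorted(set(mine_counts) - set(ref_counts)),
        "only_in_reference": sorted(set(ref_counts) - set(mine_counts)),
        "differing_counts_per_sample": differing,
    }


# Toy in-memory tables (made-up values) standing in for the two downloaded files.
mine = load_counts(io.StringIO("sgRNA\tGene\tT0\tT8\ng1\tA\t10\t5\ng2\tB\t7\t0\n"))
ref = load_counts(io.StringIO("sgRNA\tGene\tT0\tT8\ng1\tA\t10\t9\ng3\tC\t4\t4\n"))
summary = compare_counts(mine, ref)
print(summary)
```

With real data, you would pass open file handles for the two downloaded tables instead of the `io.StringIO` objects. A table built from the read subset would be expected to show many differing counts compared to the pre-computed file, which would point to the wrong input having been used downstream.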

More soon, thanks! :slight_smile: