PCA with ChIP-seq data using DeepTools

Hello,

I'm using DeepTools to generate a PCA plot from ChIP-seq data, following a suggestion in an answer posted 7 days ago (DiffBind: Generating a PCA in Galaxy). I made the input file for plotPCA (DeepTools) with the multiBigwigSummary tool (DeepTools). My question: the very first option of multiBigwigSummary is ``Sample order matters: YES or NO``. What does this mean, and when should I care whether my samples are in a particular order? I'm asking because I ran the tool twice, once with the samples in a particular order and once without, and the final plotPCA result was different. Hence, I would like to understand this first option of multiBigwigSummary.

Best wishes,

Manolis

Hi @Manolis1

Under that option on the form is a description of what it means.

By default, the order of samples given to the program is dependent on their order in your history. If the order of the samples is vital to you, select Yes below.

To translate that…

  1. If left at “no”, the input files are ordered by their dataset number in the history, from smallest to largest. This is somewhat arbitrary but may not matter for some uses.
  2. If set to “yes”, the order in which you choose the input files in the select list is used instead.

When used outside of Galaxy, that same ordering would be determined by which file name was typed in first on the command line.
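As a rough illustration of the two modes, here is a minimal Python sketch. It is not the Galaxy wrapper's actual code: the history numbers, file names, and the helpers `resolve_order` and `build_command` are all made up for the example. Only the `multiBigwigSummary bins -b ... -o ...` command shape comes from the deepTools command line.

```python
# Hypothetical history: dataset number -> file name (names are made up).
history = {12: "H3K27ac_rep2.bw", 7: "input.bw", 9: "H3K27ac_rep1.bw"}

# Assumed order in which the datasets were picked in the select list.
selected = [9, 12, 7]


def resolve_order(selected_ids, order_matters):
    """Return file names in the order they would be handed to the tool."""
    # "No": sort by history dataset number. "Yes": keep the selection order.
    ids = list(selected_ids) if order_matters else sorted(selected_ids)
    return [history[i] for i in ids]


def build_command(files):
    """Assemble the argument list; the file order here is the sample order."""
    return ["multiBigwigSummary", "bins", "-b", *files, "-o", "results.npz"]


print(" ".join(build_command(resolve_order(selected, order_matters=False))))
print(" ".join(build_command(resolve_order(selected, order_matters=True))))
```

The two printed commands list the same files in a different order, which is all the option changes in this sketch. Whether that affects the downstream numbers is a separate question.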

For how the math works with this tool, it is probably best to consult the original tool documentation. My guess is that the first dataset is setting some kind of normalization or bandwidth range for the average scores and binning, but that is a pure guess (based on how other tools tend to work)!

Now, this could be important for other reasons: rerunning the tool in an attempt to reproduce a prior result exactly is one example.

Overall, it probably doesn’t matter too much for the scientific interpretation of the results.