What type of information is represented on each axis of the MA plot? Also, what do the dots and the colors of the dots represent?
One example of an MA plot is in the RaceID Tutorial:
Here we see a comparison of two groups: the cells in cluster 1, and the cells in cluster 3.
On the Y-axis we see the difference in expression between the two clusters (typically shown as a log fold change), and on the X-axis we see the average expression of each gene across the two clusters (also typically on a log scale).
Each dot is a gene, and the genes that we think show the most differential expression are highlighted in red and labelled. The rest are grey and unnamed.
The top half of the plot shows us genes expressed highly in cluster 3, and the bottom half of the plot shows us genes expressed highly in cluster 1.
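To make the two axes concrete, here is a minimal sketch of how the coordinates of a single dot could be computed from a gene's mean expression in each cluster. It uses the common textbook definition of an MA plot (M is the log2 ratio, A is the average of the log2 values); the exact formula, log base, and any pseudocount used by RaceID may differ, and the function name `ma_values` and its `pseudocount` parameter are made up for this illustration.

```python
import math


def ma_values(mean_cluster1, mean_cluster3, pseudocount=0.0):
    """Return (M, A) for one gene.

    M (Y-axis): log2 fold change, positive when the gene is higher in cluster 3.
    A (X-axis): average of the two log2 expression values.
    The sign convention and the optional pseudocount are assumptions chosen to
    match the description above, not RaceID's documented behaviour.
    """
    log1 = math.log2(mean_cluster1 + pseudocount)
    log3 = math.log2(mean_cluster3 + pseudocount)
    m = log3 - log1          # difference in expression between clusters
    a = 0.5 * (log1 + log3)  # average expression across clusters
    return m, a


# A gene with mean expression 1 in cluster 1 and 4 in cluster 3
# lands in the top half of the plot (M > 0).
m, a = ma_values(1.0, 4.0)
print(m, a)
```

Swapping the two arguments flips the sign of M, which is why genes higher in cluster 1 end up in the bottom half. Note that the means must be positive (or a pseudocount must be supplied), since the log of zero is undefined.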
What we are looking for are genes that are highly expressed in one cluster and lowly expressed in the other. This means that extreme dots in the Y-axis (both negative and positive) are useful to us, such as Top2a and Papss2.
However, because the average expression of these two genes is rather low (< 0 on the X-axis), we have less confidence that they are reliable indicators of differential expression.
For example, if a gene has an expression of 1 in cluster 1 and 3 in cluster 3, this gives a 3-fold expression change for that gene(!), but the values are so small that adding or subtracting 1 from either value changes the fold change drastically, so it is not considered robust.
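The arithmetic behind this can be sketched in a few lines. The helper names `fold_change` and `fold_change_range` are made up for this illustration, and the "+ or - 1" perturbation is simply the one from the example above, not a formal noise model.

```python
import math


def fold_change(expr_a, expr_b):
    """Fold change of group B relative to group A."""
    return expr_b / expr_a


def fold_change_range(expr_a, expr_b, noise=1):
    """Smallest and largest fold change if each value shifts by +/- noise."""
    lowest = (expr_b - noise) / (expr_a + noise)
    # If the denominator could drop to zero, the fold change is unbounded.
    if expr_a - noise <= 0:
        highest = math.inf
    else:
        highest = (expr_b + noise) / (expr_a - noise)
    return lowest, highest


# Same 3-fold change, very different robustness.
print(fold_change(1, 3), fold_change_range(1, 3))
print(fold_change(100, 300), fold_change_range(100, 300))
```

For the lowly expressed gene, shifting either value by 1 lets the fold change fall to 1 (no change at all) or grow without bound, while for the highly expressed gene it stays very close to 3.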
So, we move to the right along the X-axis towards genes which have a higher average expression and thus more trustworthy values. Due to the large average expression, we don’t need such extreme values on the Y-axis to be confident of differential expression between the two clusters.
Here we see Ptma near the top-right, which has a much higher average expression and is still reasonably more highly expressed in cluster 3 than in cluster 1.