Compare distributions with box plots, not bar plots
In many scientific journals, authors use bar plots to compare two or more distributions. Often, the error bar is only present for the upper limit and not for the lower limit. Sometimes, the bar for the control group has no error bars due to data normalization. Here, I simulate a small experiment to illustrate why this normalization is problematic and why box plots are better than bar plots for comparing two distributions.
Let’s simulate a simple experiment where the control samples have a value centered around 1 and the experimental samples have a value centered around 2. We might imagine that these values represent gene expression or some other measure of interest.
With normal distributions, we can use the t-test to compare the distribution of values from the control group and the experimental group. This helps us to determine if the difference between the two distributions is statistically significant.
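A minimal sketch of the simulation and the Welch test (the seed, sample size, and standard deviation here are my assumptions, not necessarily the values behind the results below):

```r
# Simulate a small experiment: control values centered near 1,
# experimental values centered near 2 (seed and sd are assumptions).
set.seed(42)
n <- 5
control      <- rnorm(n, mean = 1, sd = 0.5)
experimental <- rnorm(n, mean = 2, sd = 0.5)

# Welch's two-sample t-test is the default for t.test() in R:
t.test(control, experimental)
```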
| difference | control mean | experimental mean | t | p-value | df | conf. low | conf. high | method | alternative |
|-----------:|-------------:|------------------:|--:|--------:|---:|----------:|-----------:|:-------|:------------|
| -1.28566 | 1.206106 | 2.491766 | -2.525504 | 0.0407433 | 6.739684 | -2.498911 | -0.0724094 | Welch Two Sample t-test | two.sided |
After normalizing the data so that the controls are equal to 1.0, we no longer test for a difference between the control and experimental distributions. Instead, we now test if the experimental distribution is different from 1.0:
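A sketch of one common normalization convention (an assumption about how the scaling is done): each value is divided by the control value from the same replicate, so every control becomes exactly 1.

```r
# Simulated data as before (seed and sd are assumptions).
set.seed(42)
n <- 5
control      <- rnorm(n, mean = 1, sd = 0.5)
experimental <- rnorm(n, mean = 2, sd = 0.5)

# Per-replicate normalization: every control value becomes exactly 1,
# so the control group has zero variance after scaling.
control_norm      <- control / control
experimental_norm <- experimental / control

# This now effectively asks "is the experimental distribution
# different from 1?" rather than comparing two distributions:
t.test(control_norm, experimental_norm)
```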
| difference | control mean | experimental mean | t | p-value | df | conf. low | conf. high | method | alternative |
|-----------:|-------------:|------------------:|--:|--------:|---:|----------:|-----------:|:-------|:------------|
| -1.196441 | 1 | 2.196441 | -8.018042 | 0.0013126 | 4 | -1.610738 | -0.7821436 | Welch Two Sample t-test | two.sided |
In this second test, the mean value for the control samples is exactly 1 instead of 1.2. The t-statistic of -8 is inflated relative to the correct value of -2.5, and the p-value is lower than it should be.
Journals often publish bar plot figures that do not clearly communicate the results to the reader:
- The data is displayed as a bar plot of mean or median values.
- The lower bounds of error bars are omitted.
- Legends do not explain the meaning of the bars or error bars.
Below, I show the data represented as bars of mean values with error bars of standard deviations.
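A sketch of that kind of figure in ggplot2 (the simulated data and styling are my assumptions, not a reproduction of the original plot). Bars show the group means; error bars extend one standard deviation above and below each mean.

```r
library(ggplot2)

# Simulated data as before (seed and sd are assumptions).
set.seed(42)
n <- 5
d <- data.frame(
  value = c(rnorm(n, mean = 1, sd = 0.5), rnorm(n, mean = 2, sd = 0.5)),
  group = rep(c("Control", "Experimental"), each = n)
)

# Summarize each group: mean and standard deviation.
stats <- aggregate(value ~ group, data = d,
                   FUN = function(x) c(mean = mean(x), sd = sd(x)))
stats <- do.call(data.frame, stats)  # flatten the matrix column

# Bar plot of means with +/- 1 sd error bars.
ggplot(stats, aes(x = group, y = value.mean)) +
  geom_col() +
  geom_errorbar(aes(ymin = value.mean - value.sd,
                    ymax = value.mean + value.sd), width = 0.2)
```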
Notice that the figure on the right might appear to show that there is a very significant difference between the experimental and control groups.
Normalizing the data so that the controls are equal to 1.0 has two effects:
- The technical variation between experiments is hidden.
- The difference between the control and the experimental groups is exaggerated. This is because we’re comparing the experimental distribution to 1.0 instead of comparing it to the control distribution.
A box plot is a good way to compare two or more distributions. Here’s the anatomy of a boxplot created with ggplot2:
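A sketch of a basic geom_boxplot call, with the anatomy noted in comments (the data are simulated and the real figure's styling is not reproduced):

```r
library(ggplot2)

# Simulated data as before (seed and sd are assumptions).
set.seed(42)
n <- 5
d <- data.frame(
  value = c(rnorm(n, mean = 1, sd = 0.5), rnorm(n, mean = 2, sd = 0.5)),
  group = rep(c("Control", "Experimental"), each = n)
)

# In geom_boxplot: the middle line is the median, the box spans the
# 25th to 75th percentiles (the IQR), the whiskers extend to the most
# extreme points within 1.5 * IQR of the box, and points beyond the
# whiskers are drawn individually as outliers.
ggplot(d, aes(x = group, y = value)) +
  geom_boxplot()
```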
I would like to see data presented with box plots as shown below instead of bar plots. I think the scientific question, “Are the distributions different from each other?”, comes naturally from this kind of presentation.
On the left, notice that the range of values in the control group overlaps with the range of values in the experimental group. The variation in the control group tells us something about the amount of technical variation between repeated experiments. If the technical variation is too high, we might have some reason to be skeptical about the reproducibility of the assay.
On the right, we hide the variation between control samples, so we can make no assessment of the technical variation in the experiment. This point is more obvious for the box plot than it is for the bar plot above.