Here are a few tips for making heatmaps in R. We’ll use quantile color
breaks, so each color represents an equal proportion of the data. We’ll also
cluster the data with neatly sorted dendrograms, so it’s easy to see which
samples are closely or distantly related.
Let’s increase the values for group 1 by a factor of 5:
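Here’s a minimal sketch of that step. The matrix size, the gamma distribution, and the column naming are assumptions made for illustration, not necessarily the data used in the rest of this post.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Assumed layout: 20 rows and 48 columns, split evenly into 3 groups.
n_rows <- 20
n_cols <- 48

# Assumed distribution: gamma-distributed values, which are right-skewed.
mat <- matrix(rgamma(n_rows * n_cols, shape = 1) * 5, ncol = n_cols)

# Label each column with its group number followed by a running index.
col_groups <- rep(1:3, each = n_cols / 3)
colnames(mat) <- paste0(col_groups, "_", seq_len(n_cols))
rownames(mat) <- paste0("row", seq_len(n_rows))

# Increase the group 1 values by a factor of 5.
mat[, col_groups == 1] <- mat[, col_groups == 1] * 5
```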
The data is skewed: most of the values are below 50, but the maximum is much higher.
Making a heatmap
Let’s make a heatmap and check if we can see that the group 1 values are 5
times larger than the group 2 and 3 values:
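A call along these lines draws the heatmap with pheatmap’s default breaks. The stand-in data and the 10-color palette are assumptions for this sketch.

```r
library(pheatmap)

# Stand-in data (assumed): skewed values, first 16 columns are group 1.
set.seed(42)
mat <- matrix(rgamma(20 * 48, shape = 1) * 5, ncol = 48,
              dimnames = list(paste0("r", 1:20), paste0("c", 1:48)))
mat[, 1:16] <- mat[, 1:16] * 5

# Assumed palette: 10 colors interpolated between three named colors.
heat_colors <- colorRampPalette(c("navy", "white", "firebrick3"))(10)

# No `breaks` argument, so pheatmap picks its default breaks.
p <- pheatmap(mat, color = heat_colors, main = "Default breaks")
```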
The default color breaks in pheatmap are uniformly distributed across
the range of the data.
We can see that values in group 1 are larger than values in groups 2 and 3.
However, we can’t distinguish different values within groups 2 and 3.
We can visualize the unequal proportions of data represented by each color:
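One way to check this is to count how many values land in each uniformly spaced bin. This is a sketch on assumed stand-in data, using only base R.

```r
# Stand-in data (assumed): skewed values, first 16 columns are group 1.
set.seed(42)
mat <- matrix(rgamma(20 * 48, shape = 1) * 5, ncol = 48)
mat[, 1:16] <- mat[, 1:16] * 5

# 10 equal-width color bins need 11 uniformly spaced break points.
uniform_breaks <- seq(min(mat), max(mat), length.out = 11)

# Count how many values fall into each bin.
bin_counts <- table(cut(as.vector(mat), breaks = uniform_breaks,
                        include.lowest = TRUE))

# Proportion of all values represented by each color.
bin_props <- bin_counts / length(mat)
barplot(bin_props, las = 2, ylab = "Proportion of data")
```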
With our uniform breaks and non-uniformly distributed data, we represent
a disproportionately large share of the data with a single color.
On the other hand,
the few data points greater than or equal to 100 are spread across 4 different colors.
If we reposition the breaks at the quantiles of the data, then each color
will represent an equal proportion of the data:
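A small helper can compute such breaks with base R’s `quantile()`. The function name `quantile_breaks` and the stand-in data are assumptions for this sketch.

```r
# Hypothetical helper: n break points placed at evenly spaced quantiles.
quantile_breaks <- function(x, n = 11) {
  breaks <- quantile(x, probs = seq(0, 1, length.out = n))
  # Tied values can yield repeated quantiles, so drop duplicates.
  unique(unname(breaks))
}

# Stand-in data (assumed): skewed values, first 16 columns are group 1.
set.seed(42)
mat <- matrix(rgamma(20 * 48, shape = 1) * 5, ncol = 48)
mat[, 1:16] <- mat[, 1:16] * 5

mat_breaks <- quantile_breaks(mat, n = 11)

# Each bin between consecutive breaks now holds about the same number of values.
bin_counts <- table(cut(as.vector(mat), breaks = mat_breaks,
                        include.lowest = TRUE))
```

Note that pheatmap expects the `breaks` vector to be one element longer than the `color` vector, so 11 breaks pair with 10 colors.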
When we use quantile breaks in the heatmap, we can clearly see that
group 1 values are much larger than values in groups 2 and 3, and we can
also distinguish different values within groups 2 and 3:
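As a sketch, the quantile breaks go into pheatmap’s `breaks` argument. The stand-in data and palette are assumptions.

```r
library(pheatmap)

# Stand-in data (assumed): skewed values, first 16 columns are group 1.
set.seed(42)
mat <- matrix(rgamma(20 * 48, shape = 1) * 5, ncol = 48,
              dimnames = list(paste0("r", 1:20), paste0("c", 1:48)))
mat[, 1:16] <- mat[, 1:16] * 5

# 11 break points at evenly spaced quantiles, with duplicates removed.
mat_breaks <- unique(unname(quantile(mat, probs = seq(0, 1, length.out = 11))))

# One color per bin, so one fewer color than break points.
heat_colors <- colorRampPalette(c("navy", "white", "firebrick3"))(length(mat_breaks) - 1)

p <- pheatmap(mat, color = heat_colors, breaks = mat_breaks,
              main = "Quantile breaks")
```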
Transforming the data
Instead of using quantile breaks, we can also transform the data to the log
scale. Notice that the clustering is different on this scale:
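A sketch of the log-scale version follows. The base of the logarithm (10 here) and the stand-in data are assumptions.

```r
library(pheatmap)

# Stand-in data (assumed): skewed, strictly positive values.
set.seed(42)
mat <- matrix(rgamma(20 * 48, shape = 1) * 5, ncol = 48,
              dimnames = list(paste0("r", 1:20), paste0("c", 1:48)))
mat[, 1:16] <- mat[, 1:16] * 5

# Log-transform before plotting; this requires all values to be > 0.
log_mat <- log10(mat)

heat_colors <- colorRampPalette(c("navy", "white", "firebrick3"))(10)

# Clustering is now computed on the log-scale values.
p <- pheatmap(log_mat, color = heat_colors, main = "Log10 scale")
```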
Sorting the dendrograms
The dendrogram on top of the heatmap is messy, because the branches are not sorted.
Let’s flip the branches to sort the dendrogram. The most similar
columns will appear clustered toward the left side of the plot. The columns
that are more distant from each other will appear clustered toward the right
side of the plot.
Let’s do the same for rows, too, and use these dendrograms in the heatmap:
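One way to do this is with the dendsort package, which reorders the branches of a dendrogram. The helper name `sort_hclust`, the Euclidean distance, and the stand-in data are assumptions for this sketch.

```r
library(dendsort)
library(pheatmap)

# Stand-in data (assumed): skewed values, first 16 columns are group 1.
set.seed(42)
mat <- matrix(rgamma(20 * 48, shape = 1) * 5, ncol = 48,
              dimnames = list(paste0("r", 1:20), paste0("c", 1:48)))
mat[, 1:16] <- mat[, 1:16] * 5

# Hypothetical helper: sort the branches of an hclust tree with dendsort.
sort_hclust <- function(hc) as.hclust(dendsort(as.dendrogram(hc)))

# Cluster columns (transpose first) and rows, then sort both trees.
col_tree <- sort_hclust(hclust(dist(t(mat))))
row_tree <- sort_hclust(hclust(dist(mat)))

# pheatmap accepts hclust objects for cluster_cols and cluster_rows.
p <- pheatmap(mat, cluster_cols = col_tree, cluster_rows = row_tree)
```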
Rotating column labels
Here’s a way to rotate the column labels in pheatmap (thanks to