
Introduction to ggrepel

github.com/slowkow/ggrepel

July 2018

Kamil Slowikowski

1 / 19
  • Hello! I'm Kamil Slowikowski, the creator of ggrepel.

  • I'm also a PhD student in Bioinformatics at Harvard.

  • I want to show you a brief introduction to ggrepel, so you can get started using it in your figures.

😞 Problem: text placement


library(ggplot2)
ggplot(mtcars) +
  aes(
    x = wt, y = mpg,
    label = rownames(mtcars)
  ) +
  geom_point(color = "red") +
  geom_text()

2 / 19
  • Oftentimes, we want to know the identity of each data point in a figure.

  • Unfortunately, geom_text() does not always work as expected:

    • text labels often overlap with each other

    • sometimes they fall outside the plotting area

🎉 Solution: ggrepel


library(ggrepel)
ggplot(mtcars) +
  aes(
    x = wt, y = mpg,
    label = rownames(mtcars)
  ) +
  geom_point(color = "red") +
  geom_text_repel()

3 / 19
  • This problem motivated me to create ggrepel:

    • an extension for ggplot2 that automatically places text labels without overlaps

  • I tried to make it very easy to use:

    • just replace geom_text() with geom_text_repel()

😊 Much better!

4 / 19
  • Side by side, you can see that the figure using ggrepel is much easier to read because the text is clearly visible.

Repel text labels away from


  • other text labels

  • data points

  • edges of the plotting area

5 / 19
  • The idea behind ggrepel is very simple.

  • We want to repel text labels away from:

    • other text labels

    • data points

    • and edges of the plotting area

📜 Algorithm

O(n²) N-body physical simulation

6 / 19
  • I implemented a brute-force physical simulation.

  • We iterate over all pairs of text labels and repel them away from each other.

  • We use a spring force to pull each text label back to its own data point.

  • Let's see it in action!
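The pairwise repulsion and the spring force described in these notes can be sketched in a few lines of base R. This is a toy illustration, not ggrepel's actual implementation: the function name `repel_points`, the force constants, and the choice to treat each label as a point rather than a box are all assumptions. It also leaves out repulsion from other data points and from the edges of the plotting area.

```r
# Toy O(n^2) simulation. `repel_points`, `force_push`, and `force_pull`
# are hypothetical names chosen for this sketch.
repel_points <- function(anchors, n_iter = 200,
                         force_push = 1e-3, force_pull = 1e-2) {
  labels <- anchors  # every label starts on top of its own data point
  n <- nrow(labels)
  for (iter in seq_len(n_iter)) {
    shift <- matrix(0, nrow = n, ncol = 2)
    # Brute force: visit every ordered pair of labels.
    for (i in seq_len(n)) {
      for (j in seq_len(n)) {
        if (i == j) next
        d <- labels[i, ] - labels[j, ]
        dist2 <- max(sum(d^2), 1e-6)  # clamp to avoid division by zero
        # Push label i away from label j; closer labels push harder.
        shift[i, ] <- shift[i, ] + force_push * d / dist2
      }
    }
    # Spring force: pull each label back toward its own data point.
    shift <- shift + force_pull * (anchors - labels)
    labels <- labels + shift
  }
  labels
}

# Two nearly coincident points and one point far away.
anchors <- rbind(c(0.50, 0.50), c(0.51, 0.50), c(0.20, 0.80))
placed <- repel_points(anchors)
```

In this toy version, two labels that start at exactly the same coordinates receive no push, so the example uses distinct anchors.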

🐎 ggrepel in action

7 / 19
  • This is an animation showing each step of the simulation on a loop.

  • The labels repel away from each other, and away from data points.

  • There is a spring force that pulls each label back to its own data point.

  • Notice that "Honda Civic" first moves away from its data point, and then it is pulled back until it is directly adjacent.

🐎 ggrepel in action

8 / 19
  • ggrepel works well with RStudio.

  • When you resize the plotting area, it will automatically recompute the overlaps and adjust the label positions.

💾 Installation

Install ggrepel from CRAN:

install.packages("ggrepel")

9 / 19
  • ggrepel is easy to install and only depends on ggplot2. It has no other dependencies.

  • This has quickly become the most popular piece of code I have ever written.

  • I've learned that people are happy when something just works.

  • Now let's take a look at a practical example.

🌋 Example: Volcano

Which genes show significant differential expression?

Thanks to Stephen Turner for the example data.

10 / 19
  • In bioinformatics, we often do a differential gene expression test.

  • Then, we ask: which genes show significant differential expression?

  • With ggrepel, we can actually read the gene names. That's great.

  • However, these figures are not always easy to read.

  • Sometimes ggrepel is not the best choice...
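A volcano plot of this kind could be put together roughly as follows. The data below are simulated with `rnorm()` and `runif()` and are not the example data shown on the slide; the column names (`gene`, `log2fc`, `pvalue`) and the significance cutoffs are assumptions chosen only for illustration.

```r
library(ggplot2)
library(ggrepel)

# Simulated results, NOT real expression data.
set.seed(42)
n <- 500
res <- data.frame(
  gene   = paste0("gene", seq_len(n)),
  log2fc = rnorm(n),
  pvalue = runif(n)
)
# Spike in a handful of strong effects so there is something to label.
hits <- 1:8
res$log2fc[hits] <- c(-4, -3.5, -3, -2.5, 2.5, 3, 3.5, 4)
res$pvalue[hits] <- 10^-(5:12)

# Hypothetical cutoffs: |log2 fold change| > 2 and p < 1e-4.
res$significant <- abs(res$log2fc) > 2 & res$pvalue < 1e-4
# Only significant genes get a visible label.
res$label <- ifelse(res$significant, res$gene, "")

p <- ggplot(res) +
  aes(x = log2fc, y = -log10(pvalue), label = label) +
  geom_point(color = ifelse(res$significant, "red", "grey50")) +
  geom_text_repel()
p
```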

🤔 Consider other options

  • ggrepel is not always the best choice

  • Sometimes other plots are easier to read

11 / 19
  • It is not always a good idea to add text labels to your figure.

  • Sometimes other plots are easier to read.

  • If possible, try to keep your figures easy to understand.

  • Because if you're not careful, you might get a surprising result...

🤭 Don't label too many points!

Or else you will end up with @accidental__aRt

12 / 19
  • I see a lot of figures in the wild with too many text labels!

  • To avoid this situation, you might consider labeling a small subset of your data points.

💡 Use the empty string ""

library(ggrepel)
d <- subset(
  mtcars, wt > 3 & wt < 4
)
# Just label 3 items.
d$car <- ""
i <- c(2, 3, 16)
d$car[i] <- rownames(d)[i]
ggplot(d) +
  aes(wt, mpg, label = car) +
  geom_point(
    color = ifelse(
      d$car != "",
      "red", "grey50"
    )
  ) +
  geom_text_repel()

13 / 19
  • You can use the empty string to hide most of the labels.

  • Then you can add labels for just a few data points.

  • By using the empty string strategy, we are saying that we want the unlabeled data points to continue repelling the text from the labeled data points.
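The index vector on the slide is hand-picked. One way to build the label column from a rule instead is sketched here, assuming we want to label the three cars with the highest `mpg`. The helper name `label_top_n` is hypothetical and not part of ggrepel.

```r
library(ggplot2)
library(ggrepel)

# Hypothetical helper: keep the names of the n largest values and
# replace every other name with the empty string.
label_top_n <- function(names, values, n = 3) {
  keep <- rank(-values, ties.method = "first") <= n
  ifelse(keep, names, "")
}

d <- subset(mtcars, wt > 3 & wt < 4)
d$car <- label_top_n(rownames(d), d$mpg, n = 3)

# Every row of d is still passed to geom_text_repel(), so the unlabeled
# points keep pushing the three visible labels away from themselves.
p <- ggplot(d) +
  aes(wt, mpg, label = car) +
  geom_point(color = ifelse(d$car != "", "red", "grey50")) +
  geom_text_repel()
p
```

Filtering `d` down to three rows before plotting would also show three labels, but the removed points would no longer take part in the repulsion.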

🎓 Learn from examples in the vignette

vignette("ggrepel") # <- Run this command in RStudio

14 / 19
  • For more examples, check out the vignette and feel free to copy code.

  • If you have a new example you'd like to share, please send it along!

🐛 Please report bugs

github.com/slowkow/ggrepel/issues


🎁   Contributions are very welcome!

🙌   We have 8 contributors so far.

❓   Stack Overflow is the best place to ask questions.

15 / 19
  • We have many open issues, and I don't have time to fix all of them.

  • If you want to contribute, please let me know and I'll do my best to get you started.

  • For questions about using R and making figures, I like to use Stack Overflow.

  • If you want to see more examples of ggplot extensions...

  • The ggplot2 extension gallery has lots of examples that might meet your needs.

  • After browsing, you might also get an idea for creating a new extension that is useful for your own work!

  • If you want to make your own extension... I have some links for you.

🛠️ Make a ggplot2 extension!

Extending ggplot2

     by Hadley Wickham

How to make a generic stat in ggplot2

     by Elio Campitelli

🌟 ggplot2 Internals (WOW!)

     by Brodie Gaslam

17 / 19
  • I wish I had these resources when I started developing ggrepel.

  • Hadley's guide will show you how to make an extension, step by step.

  • Elio's guide will show you how to make a very generic extension that works with any function which accepts a dataframe as input and produces a similar dataframe as output.

  • Finally, if you want to learn more about the internals of ggplot, be sure to look at Brodie's guide. It is the most comprehensive and detailed resource about how ggplot2 works.
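The idea of a generic stat, one that applies any function from data frame to data frame, can be sketched with `ggproto()`. This is a simplified illustration under stated assumptions, not the code from Elio's guide; the names `StatApply`, `stat_apply`, and `center_y` are hypothetical.

```r
library(ggplot2)

# Hypothetical stat: apply a user-supplied function to each group's data.
# `fun` must accept a data frame and return a data frame that still has
# the columns the chosen geom needs (here x and y).
StatApply <- ggproto("StatApply", Stat,
  required_aes = c("x", "y"),
  compute_group = function(data, scales, fun = identity) {
    fun(data)
  }
)

# Constructor that follows the usual stat_*() pattern.
stat_apply <- function(mapping = NULL, data = NULL, geom = "point",
                       position = "identity", fun = identity,
                       na.rm = FALSE, show.legend = NA,
                       inherit.aes = TRUE, ...) {
  layer(
    stat = StatApply, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(fun = fun, na.rm = na.rm, ...)
  )
}

# Usage: center the y values around zero before drawing the points.
center_y <- function(df) {
  df$y <- df$y - mean(df$y)
  df
}
p <- ggplot(mtcars, aes(wt, mpg)) +
  stat_apply(fun = center_y)
p
```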

📚 Related work

Python

JavaScript

18 / 19
  • If you work with Python or JavaScript, you might want to check out these projects.

  • They offer similar functionality to ggrepel.




These slides are available at:

slowkow.com/ggrepel


Kamil Slowikowski
@slowkow





Made with ⚔ xaringan

19 / 19
  • These slides are available online, so you can follow the links to all the resources I highlighted.

  • Feel free to follow me and ask questions on Twitter!
