R

Get notifications on desktop and mobile from long-running jobs in your terminal sessions

2 December 2025·847 words·4 mins

Bash R Python

If you’re like me, you get tired of waiting for long-running jobs in the terminal. You run a new command, and you don’t really know how long it should take to finish. …

hlabud: HLA genotype analysis in R

5 April 2023·71 words·1 min

hlabud provides methods to retrieve sequence alignment data from IMGTHLA and convert the data into convenient R matrices ready for downstream analysis. See the usage examples to …

Benchmark principal component analysis (PCA) of scRNA-seq data in R

24 January 2022·1896 words·9 mins

R Tutorial

Principal component analysis (PCA) is frequently used for analysis of single-cell RNA-seq (scRNA-seq) data. We can use it to reduce the dimensionality of a large matrix with …

Make a table with ligands and receptors in R with OmnipathR

24 November 2020·4860 words·23 mins

R Tutorial

Curated lists of genes help computational biologists to focus analyses on a subset of genes that might be important for a research question. For example, we might be interested to …

Make a table with your most recent coauthors in R

13 August 2020·538 words·3 mins

R Tutorial

Some grant agencies might require a table that lists all of your coauthors, departments, and dates for publications from the last few years. Making such a table can be a laborious …

Working with a sparse matrix in R

11 March 2020·2272 words·11 mins

R Tutorial

Sparse matrices are necessary for dealing with large single-cell RNA-seq datasets. They require less memory than dense matrices, and they allow some computations to be more …

Harmony in motion: visualize an iterative algorithm for aligning multiple datasets

25 August 2019·2487 words·12 mins

R Tutorial

Harmony is an algorithm for aligning multiple high-dimensional datasets, described by Ilya Korsunsky et al. in this paper. When analyzing multiple single-cell RNA-seq datasets, …

Extract data from a PDF file with Tabula

29 December 2018·358 words·2 mins

R Data Rheumatoid-Arthritis Tutorial

Kirkham et al. 2006 is a prospective 2-year study of 60 patients with rheumatoid arthritis (RA). It shows that “synovial membrane cytokine mRNA expression is predictive of …

Generate a large color palette with Colorgorical

23 July 2018·222 words·2 mins

R Tutorial

Sometimes we need a lot of colors to represent all the categories in our data. We can use the httr and jsonlite packages to retrieve a list of colors from the Colorgorical website …

Make heatmaps in R with pheatmap

16 February 2017·932 words·5 mins

R Tutorial

Here are a few tips for making heatmaps with the pheatmap R package by Raivo Kolde. We’ll use quantile color breaks, so each color represents an equal proportion of the data. …

Color points by density with ggplot2

17 January 2017·396 words·2 mins

R Tutorial

Here, we use the 2D kernel density estimation function from the MASS R package to color points by density in a plot created with ggplot2. This helps us to see where most of the …

Immunogenomics.io

9 January 2016·36 words·1 min

R Javascript

View the primary genomics data from several biomedical research studies. I developed many of the data visualizations on this site with R and JavaScript. You can view bulk RNA-seq, …

ggrepel: Automatically Position Non-Overlapping Text Labels with 'ggplot2'

9 January 2016·106 words·1 min

ggrepel is an R package that provides geoms for ggplot2 to repel overlapping text labels: geom_text_repel() and geom_label_repel(). Text labels repel away from each other, away from …

Determine if a transcription factor is bound to a genomic site with CENTIPEDE

3 June 2015·106 words·1 min

R Tutorial

I wrote a practical tutorial on how to use CENTIPEDE to determine if a transcription factor is bound to a site in the genome. The tutorial explains how to prepare appropriate …

Get human transcription factor target genes

5 March 2015·380 words·2 mins

R Tutorial

I made a data package with human transcription factor target genes for use in R. It is a collection of data from three sources: TRED, ITFP, and ENCODE. I use them to test if the …

Quickly aggregate your data in R with data.table

28 January 2015·429 words·3 mins

In genomics data, we often have multiple measurements for each gene. Sometimes we want to aggregate those measurements with the mean, median, or sum. The data.table R package can …

Create a quantile-quantile plot with ggplot2

16 February 2014·652 words·4 mins

After performing many tests for statistical significance, the next step is to check if any results are more extreme than we would expect by random chance. One way to do this is by …