If you’re like me, you get tired of waiting for long-running jobs in the terminal. You run a new command, and you don’t really know how long it should take to finish. …
hlabud provides methods to retrieve sequence alignment data from IMGTHLA and convert the data into convenient R matrices ready for downstream analysis. See the usage examples to …
Benchmark principal component analysis (PCA) of scRNA-seq data in R
Principal component analysis (PCA) is frequently used for analysis of single-cell RNA-seq (scRNA-seq) data. We can use it to reduce the dimensionality of a large matrix with …
Make a table with ligands and receptors in R with OmnipathR
Curated lists of genes help computational biologists to focus analyses on a subset of genes that might be important for a research question. For example, we might be interested to …
Some grant agencies might require a table that lists all of your coauthors, departments, and dates for publications from the last few years. Making such a table can be a laborious …
Sparse matrices are necessary for dealing with large single-cell RNA-seq datasets. They require less memory than dense matrices, and they allow some computations to be more …
Harmony in motion: visualize an iterative algorithm for aligning multiple datasets
Harmony is an algorithm for aligning multiple high-dimensional datasets, described by Ilya Korsunsky et al. in this paper. When analyzing multiple single-cell RNA-seq datasets, …
Kirkham et al. 2006 is a prospective 2-year study of 60 patients with rheumatoid arthritis (RA). It shows that “synovial membrane cytokine mRNA expression is predictive of …
Sometimes we need a lot of colors to represent all the categories in our data. We can use the httr and jsonlite packages to retrieve a list of colors from the Colorgorical website …
Here are a few tips for making heatmaps with the pheatmap R package by Raivo Kolde. We’ll use quantile color breaks, so each color represents an equal proportion of the data. …
Here, we use the 2D kernel density estimation function from the MASS R package to color points by density in a plot created with ggplot2. This helps us to see where most of the …
View the primary genomics data from several biomedical research studies. I developed many of the data visualizations on this site with R and JavaScript. You can view bulk RNA-seq, …
ggrepel: Automatically Position Non-Overlapping Text Labels with 'ggplot2'
ggrepel is an R package that provides geoms for ggplot2 to repel overlapping text labels:
geom_text_repel() and geom_label_repel(). Text labels repel away from each other, away from …
Determine if a transcription factor is bound to a genomic site with CENTIPEDE
I wrote a practical tutorial on how to use CENTIPEDE to determine if a transcription factor is bound to a site in the genome. The tutorial explains how to prepare appropriate …
I made a data package with human transcription factor target genes for use in R. It is a collection of data from three sources: TRED, ITFP, and ENCODE. I use them to test if the …
In genomics data, we often have multiple measurements for each gene. Sometimes we want to aggregate those measurements with the mean, median, or sum. The data.table R package can …
After performing many tests for statistical significance, the next step is to check if any results are more extreme than we would expect by random chance. One way to do this is by …