Extract data from a PDF file with Tabula
Kirkham et al. 2006 is a prospective 2-year study of 60 patients with rheumatoid arthritis (RA). It shows that “synovial membrane cytokine mRNA expression is predictive of joint damage progression in RA”. The PDF includes a few tables with data on cytokine measurements and correlations with joint damage. Here, we’ll use Tabula to extract data from tables in the PDF file. Then we’ll make figures with R.
Make heatmaps in R with pheatmap
Here are a few tips for making heatmaps with the pheatmap R package by Raivo Kolde. We’ll use quantile color breaks, so each color represents an equal proportion of the data. We’ll also cluster the data with neatly sorted dendrograms, so it’s easy to see which samples are closely or distantly related.
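To illustrate what "quantile color breaks" means, here is a minimal sketch in Python with NumPy. It is not the pheatmap API and not the code from the post; the function name `quantile_breaks` and the toy data are assumptions made for this example. The idea is to place the color breaks at evenly spaced quantiles of the data, so each color bin holds roughly the same number of values even when the data are skewed.

```python
import numpy as np


def quantile_breaks(values, n=11):
    """Return break points at n evenly spaced quantiles of `values`.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(values, dtype=float).ravel()
    probs = np.linspace(0.0, 1.0, n)
    breaks = np.quantile(values, probs)
    # Tied values can produce duplicate breaks, so keep only unique ones.
    return np.unique(breaks)


# Skewed toy data: evenly spaced breaks would leave most colors unused.
rng = np.random.default_rng(0)
data = rng.exponential(size=1000)

breaks = quantile_breaks(data, n=11)

# Count how many values land in each color bin.
counts, _ = np.histogram(data, bins=breaks)
print(counts)
```

The resulting break points could then be handed to whatever plotting function draws the heatmap, with one color per bin.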
Color points by density with ggplot2
Build bioinformatics pipelines with Snakemake
Snakemake is a Pythonic variant of GNU Make. I recently learned how to use it to build and launch bioinformatics pipelines on an LSF cluster, but I had trouble understanding its documentation. I like to learn by trying simple examples, so this post walks you through a very simple pipeline step by step. If you already know how to use Snakemake, you might be interested in copying my Snakefiles for RNA-seq data analysis here.
Determine if a transcription factor is bound to a genomic site with CENTIPEDE
How to ssh to a remote server without typing your password
Here are a few tips for using ssh more effectively. Log in to your server with public key authentication instead of typing a password. Use your ssh config file to create short and memorable aliases for your servers. Also, use aliases to connect through a login server into a work server.