Kamil Slowikowski

Run Picard tools and collate multiple metrics files

Picard is a set of Java command-line tools for manipulating high-throughput sequencing (HTS) data files such as BAM and VCF. I needed to check the quality of thousands of BAM files, so I wrote a Bash script called picardmetrics. It runs 10 of the Picard tools on a BAM file and collates all of the generated metrics files into a single table. I also include utility scripts for generating the reference files that Picard requires.
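To give a rough idea of what collating metrics files involves, here is a minimal sketch. It is not the picardmetrics implementation. It assumes each file follows the usual Picard layout: comment lines starting with "#", a "## METRICS CLASS" line, a tab-separated header row, and then the value rows. The function names picard_first_row and collate_metrics are made up for this example, and only the first value row of each file is kept.

```bash
#!/usr/bin/env bash
# Sketch only: collate the first metrics row of several Picard metrics files.
# Assumed layout: "#" comments, "## METRICS CLASS", header row, value rows.

# Print "file<TAB>metric<TAB>value" for the first value row of one file.
picard_first_row() {
  awk -v file="$1" '
    BEGIN { FS = "\t"; OFS = "\t" }
    # The header row is the first non-comment line after this marker.
    /^## METRICS CLASS/ { want_header = 1; next }
    # Skip all other comment lines and empty lines.
    /^#/ || /^$/ { next }
    # Remember the column names from the header row.
    want_header == 1 { n = split($0, names, FS); want_header = 2; next }
    # Emit one line per column of the first value row, then stop.
    want_header == 2 {
      for (i = 1; i <= n; i++) print file, names[i], $i
      exit
    }
  ' "$1"
}

# Collate any number of metrics files into one long-format table.
collate_metrics() {
  printf 'file\tmetric\tvalue\n'
  for f in "$@"; do
    picard_first_row "$f"
  done
}
```

The long format (one line per file and metric) is a choice made for this sketch: different Picard tools report different columns, so stacking metric/value pairs avoids having to align headers across tools.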

featureCounts requires identical mate ids

featureCounts, a read-counting program, requires the two mates of a read pair to have identical mate ids before it treats them as correctly paired. However, FASTQ files generated from an SRA file with fastq-dump have a different mate id for each mate in a pair: the forward and reverse mate ids end with .1 and .2, respectively. I wrote a Bash function to fix BAM files with this problem.
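The idea behind such a fix can be sketched as follows. This is an illustration rather than the function from the post: strip_mate_suffix is a hypothetical name, and it works on SAM text from standard input, so converting from and to BAM is left to samtools. It removes a trailing .1 or .2 from the read name in the first column and passes header lines through unchanged.

```bash
#!/usr/bin/env bash
# Sketch only: strip a trailing ".1" or ".2" from the read name (column 1)
# of SAM records read from stdin. Header lines start with "@" and are
# printed unchanged; read names cannot start with "@" in the SAM format.
strip_mate_suffix() {
  awk 'BEGIN { FS = "\t"; OFS = "\t" }
       /^@/ { print; next }
       { sub(/\.[12]$/, "", $1); print }'
}

# Hypothetical usage, assuming samtools is installed and in.bam exists:
#   samtools view -h in.bam | strip_mate_suffix | samtools view -b -o fixed.bam -
```

This only makes sense when every read name is known to carry the suffix, because a name that legitimately ends in .1 or .2 would be shortened as well.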