
Diamond

Diamond is a new high-throughput program for aligning DNA reads or protein sequences against a protein reference database such as NR, at up to 20,000 times the speed of BLAST, with high sensitivity.

Information

Latest Version: 0.9.9
Added: November, 2016
Updated: July, 2017
Link: http://ab.inf.uni-tuebingen.de/software/diamond/

Usage

See which versions of diamond are available:

$ module avail diamond

Load the diamond environment module and run it:

$ module load diamond/0.9.9
$ diamond

An example SLURM submission script might look like:

#!/usr/bin/env bash
#SBATCH -p batch
#SBATCH -n 4
#SBATCH -J diamond

# Load diamond module
module load diamond/0.9.9

# Set up environment
export DATADIR=/home/aorth/data
export WORKDIR=/var/scratch/aorth/diamond-2017-08-17

# Create and change to working directory
mkdir -p "$WORKDIR"
cd "$WORKDIR"

# Align the nucleotide reads against NR; -f 5 writes BLAST XML output
diamond blastx -p 4 -q "$DATADIR/test_nt_seq.fa" --sensitive -d /export/data/bio/diamond/nr -o test_nt_seq.xml -f 5

Make sure the number of CPUs in your SLURM request (-n) matches the number of threads in your diamond command line (-p).
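
One way to keep the two values from drifting apart is to derive the thread count from the job environment instead of typing it twice. The following is a minimal sketch, assuming SLURM exports SLURM_NTASKS to the job when -n is given. The diamond_threads helper name and the fallback value of 4 are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: print the thread count to pass to "diamond -p".
# Uses SLURM_NTASKS (set by SLURM when -n is requested) and falls back
# to 4, an assumed default, when run outside a SLURM job.
diamond_threads() {
    echo "${SLURM_NTASKS:-4}"
}

# Show the value the job script would pass to diamond.
echo "diamond would run with -p $(diamond_threads)"
```

In the submission script above, -p 4 could then be written as -p "$(diamond_threads)", so changing #SBATCH -n is the only edit needed.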

Installation

Notes from the sysadmin during installation:

$ cd /tmp
$ wget https://github.com/bbuchfink/diamond/releases/download/v0.9.9/diamond-linux64.tar.gz
$ tar xf diamond-linux64.tar.gz
$ sudo mkdir -p /export/apps/diamond/0.9.9
$ sudo cp diamond diamond_manual.pdf /export/apps/diamond/0.9.9

Diamond requires specially formatted databases, which you create with the diamond makedb subcommand. The input must be a FASTA file, which may be gzip-compressed. For example, to build a database from the NCBI NR protein collection:

$ wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz
$ diamond makedb --in nr.gz -d nr
$ sudo mkdir -p /export/data/bio/diamond
$ sudo cp nr.dmnd /export/data/bio/diamond
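
Because NR is a large download, it can be useful to try the makedb step on a tiny FASTA file first. This is a minimal sketch, assuming diamond is on the PATH after loading the module. The file name tiny.fa, the database name tiny, and both sequences are made up for illustration.

```shell
# Write a two-record protein FASTA file (made-up sequences, illustration only).
cat > tiny.fa <<'EOF'
>seq1 hypothetical example protein
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ
>seq2 hypothetical example protein
MSDNELKRAVEELLKKHGYDPSTA
EOF

# Count the records; each FASTA record starts with a ">" header line.
records=$(grep -c '^>' tiny.fa)
echo "tiny.fa contains $records sequences"

# Build the database only if diamond is available (e.g. after "module load").
if command -v diamond >/dev/null 2>&1; then
    diamond makedb --in tiny.fa -d tiny   # should write tiny.dmnd
fi
```

The --in and -d options are the same ones used for NR above, so a successful run here suggests the full build will work once the download completes.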
