====== Diamond ======
Diamond is a high-throughput program for aligning DNA reads or protein sequences against a protein reference database such as NR, at up to 20,000 times the speed of BLAST, with high sensitivity.

===== Information =====
  * Version: 2.0.14
  * Added: November, 2016
  * Updated: March, 2022
  * Link: https://github.com/bbuchfink/diamond

===== Usage =====
See which versions are available:

<code>
$ module avail diamond
</code>

Load one version into your environment and run it:

<code>
$ module load diamond/2.0.14
$ diamond
</code>

An example SLURM submission script might look like:

<code>
#!/usr/bin/env bash
#SBATCH -p batch
#SBATCH -n 4
#SBATCH -J diamond

# Load diamond module
module load diamond/2.0.14

# Set up environment
export DATADIR=/home/aorth/data
export WORKDIR=/var/scratch/aorth/diamond-2017-08-17

# Create and change to working directory
mkdir -p "$WORKDIR"
cd "$WORKDIR"

diamond blastx -p 4 -q "$DATADIR/test_nt_seq.fa" --sensitive -d /export/data/bio/diamond/nr -o test_nt_seq.xml -f 5
</code>

Make sure the number of CPUs in your SLURM request (''-n'') matches the number of threads on your diamond command line (''-p'').

===== Installation =====
Notes from the sysadmin during installation:

<code>
$ cd /tmp
$ wget https://github.com/bbuchfink/diamond/releases/download/v2.0.14/diamond-linux64.tar.gz
$ sudo mkdir -p /export/apps/diamond/2.0.14/bin
$ tar xf diamond-linux64.tar.gz
$ sudo cp diamond /export/apps/diamond/2.0.14/bin
</code>

Diamond requires specially formatted databases, which you create using the ''diamond makedb'' subcommand. The input must be a FASTA file, which may be gzip-compressed. For example, to build a database from NCBI's NR with taxonomy information:

<code>
$ wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz
$ wget ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/accession2taxid/prot.accession2taxid.FULL.gz
$ wget ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdmp.zip
$ unzip taxdmp.zip
$ diamond makedb --in nr.gz --db nr --taxonmap prot.accession2taxid.FULL.gz --taxonnodes nodes.dmp --taxonnames names.dmp
$ sudo mkdir -p /export/data/bio/diamond
$ sudo cp nr.dmnd /export/data/bio/diamond
</code>
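
The advice in the Usage section about keeping ''-n'' and ''-p'' in sync can also be handled inside the job script itself. The following is a minimal sketch, not part of the original notes. It assumes SLURM exports ''SLURM_NTASKS'' when a job is submitted with ''-n'', and it falls back to one thread outside a job. The helper names ''diamond_threads'' and ''build_diamond_cmd'' are hypothetical, and the script only prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Sketch: derive diamond's thread count from the SLURM allocation so that
# the -p value cannot drift from the "#SBATCH -n" request.

# Assumption: SLURM sets SLURM_NTASKS for jobs submitted with -n.
# Outside a SLURM job the variable is unset, so default to 1 thread.
diamond_threads() {
    echo "${SLURM_NTASKS:-1}"
}

# Hypothetical helper: print (do not run) the blastx command line.
# Arguments: $1 = query FASTA, $2 = diamond database, $3 = output file.
build_diamond_cmd() {
    echo "diamond blastx -p $(diamond_threads) -q $1 --sensitive -d $2 -o $3 -f 5"
}

# Show the command that would run for the example job above.
build_diamond_cmd test_nt_seq.fa /export/data/bio/diamond/nr test_nt_seq.xml
```

In a real submission script you would call diamond directly with ''-p "$(diamond_threads)"'' instead of echoing the command, so changing ''#SBATCH -n'' is the only edit needed to change the thread count.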