ILRI Research Computing

##########################
# Run Blast using sbatch #
##########################

1) Create a new sbatch file (call it blast.sbatch) with the following heading. The #! line MUST be the first line of the file. It tells the computer (or in this case sbatch) to run the code using the bash shell interpreter.

#!/bin/bash
#SBATCH -p highmem
#SBATCH -n 8

Normally, any other # in a bash script means that a comment follows. However, since we execute the file with the sbatch command (see below), sbatch knows to look for lines beginning with #SBATCH and to use the information that follows. Here we are telling sbatch to use the highmem partition (the mammoth server) and 8 CPUs to perform the calculation. It is also important to tell the program you are running that it has 8 CPUs available so that it will use them; otherwise you will be reserving 8 CPUs but only using one.
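One way to keep the reserved and the used CPU counts in step is to read the number from the environment instead of typing it twice. This is only a sketch, not necessarily how jobs on this cluster are written: it assumes that Slurm sets the SLURM_NTASKS variable inside the job when -n is given, and the THREADS variable name is made up for illustration.

```shell
#!/bin/bash
#SBATCH -p highmem
#SBATCH -n 8

# Assumption: Slurm exports SLURM_NTASKS inside the job when -n is used.
# Fall back to 1 so the script also runs outside of Slurm (for testing).
THREADS="${SLURM_NTASKS:-1}"

# THREADS is a hypothetical variable name; pass it to whatever thread
# option your program has, for example: -num_threads "$THREADS"
echo "This job may use ${THREADS} CPU(s)"
```

With this in place, changing the -n value in the heading is enough; the program's thread count follows it.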

2) There are several different job managers and different ways of setting up HPC computer systems. In practice I prefer not to assume that a job that is submitted to a different server will know how to find the different commands or even files. So I like to include the full paths for both. In order to find where the blastx command is located I simply type:

module load blast
which blastx # note this will only work if you already have the blast module loaded.
/export/apps/blast/2.2.28+/bin/blastx

From Alan's Blast sbatch script example on the wiki, I also know where the nr database is:

/export/data/bio/ncbi/blast/db/nr
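Since the job may wait in the queue for a while, it can be worth checking both paths before submitting. The sketch below is one possible way to do that; the helper names require_executable and require_blast_db are hypothetical. It relies on the fact that a BLAST database name is a prefix shared by several files rather than a single file, so it looks for any file beginning with that prefix.

```shell
# Hypothetical helper: fail if a command path is not an executable file.
require_executable() {
    if [ ! -x "$1" ]; then
        echo "not executable: $1" >&2
        return 1
    fi
}

# Hypothetical helper: fail if no file starts with the database prefix.
# If nothing matches, the glob stays literal and ls returns an error.
require_blast_db() {
    if ! ls "$1".* > /dev/null 2>&1; then
        echo "no database files found for prefix: $1" >&2
        return 1
    fi
}

# Paths taken from the text above.
BLASTX=/export/apps/blast/2.2.28+/bin/blastx
DB=/export/data/bio/ncbi/blast/db/nr

require_executable "$BLASTX" || echo "check the blastx path"
require_blast_db "$DB" || echo "check the database path"
```

Run this on the login node before calling sbatch; if either message appears, fix the path first.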

3) Edit the blast.sbatch file to include all changes. The final script looks like this:

#!/bin/bash

#SBATCH -p highmem
#SBATCH -n 8

/export/apps/blast/2.2.28+/bin/blastx \
    -db /export/data/bio/ncbi/blast/db/nr \
    -query /home/mkatari/ndl06-132-velvet31/contigs.fa \
    -out /home/mkatari/ndl06-132-velvet31/contigs.fa.nr \
    -num_threads 8 \
    -outfmt 6 \
    -evalue 0.00001

4) Run the sbatch file. As soon as you run the file a job id will be assigned to your submission.

sbatch blast.sbatch
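On submission, sbatch prints a line of the form "Submitted batch job <jobid>". If you want to reuse the job id in later commands, a small helper can pull it out of that line. This is a sketch; the function name job_id_from_sbatch_output is made up, and the sbatch call is left as a comment so the snippet can be tried without a cluster.

```shell
# Hypothetical helper: return the last word of sbatch's confirmation line,
# which is the job id in "Submitted batch job <jobid>".
job_id_from_sbatch_output() {
    echo "$1" | awk '{print $NF}'
}

# Example use on the cluster (not run here):
#   JOBID=$(job_id_from_sbatch_output "$(sbatch blast.sbatch)")
#   scontrol show job "$JOBID"
```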

You can check the status of all jobs on the cluster by typing:

squeue

You can check the details of your specific job by typing:

scontrol show job <your jobid>

You can cancel your job by typing:

scancel <your jobid>

By default, the standard output of your job is written to a file in the directory you submitted the job from, called:

slurm-<your jobid>.out
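To look at that file while the job is running, something like the following can help. It is a minimal sketch that assumes the default file name shown above and that you are in the directory you submitted from; both function names are hypothetical.

```shell
# Build the default Slurm output file name for a job id.
slurm_output_file() {
    echo "slurm-$1.out"
}

# Hypothetical helper: print the last 20 lines of a job's output file,
# or report that the file does not exist yet.
show_job_output() {
    out=$(slurm_output_file "$1")
    if [ -f "$out" ]; then
        tail -n 20 "$out"
    else
        echo "no output file yet: $out" >&2
        return 1
    fi
}
```

For example, show_job_output 12345 prints the end of slurm-12345.out if the job has started writing to it.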