mkatari-bioinformatics-august-2013-blastnotes

##########################
# Run Blast using sbatch #
##########################

1) Create a new sbatch file (call it blast.sbatch) with the following heading. The #! line MUST be the first line of the file. It tells the computer (or in this case sbatch) to run the code using the bash shell interpreter.

#!/bin/bash
#SBATCH -p highmem
#SBATCH -n 8

Normally, any other # in a bash script means that a comment follows. However, because we submit the file with the sbatch command (see below), sbatch knows to look for lines beginning with #SBATCH and to use the options that follow. Here we are telling sbatch to use the highmem partition (the mammoth server) and to reserve 8 CPUs for the calculation. It is also important to tell the program you are running that it has 8 CPUs available so that it will use them; otherwise you will be reserving 8 CPUs but using only one.
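One way to keep the reserved CPU count and the program's thread count in sync is to read the count from the environment inside the job rather than typing 8 twice. The sketch below assumes that Slurm sets the SLURM_NTASKS variable for a job submitted with -n; pick_threads is a hypothetical helper name, and it falls back to 1 when the variable is missing or not a number.

```shell
#!/bin/bash
# Sketch: derive the thread count from what Slurm allocated.
# Assumption: SLURM_NTASKS is set inside a job submitted with "-n".
# Outside a job it is unset, so we fall back to a single thread.
pick_threads() {
    local n="${SLURM_NTASKS:-1}"
    # Reject empty or non-numeric values and fall back to 1.
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

THREADS="$(pick_threads)"
echo "Using ${THREADS} thread(s)"
```

The value could then be handed to the program, for example as -num_threads "$THREADS" for blastx, so that changing the #SBATCH -n line is enough.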

2) There are several different job managers and different ways of setting up HPC systems. In practice I prefer not to assume that a job submitted to a different server will know how to find the commands, or even the files, that it needs, so I like to include the full paths for both. To find where the blastx command is located, I simply type:


module load blast
which blastx # note this will only work if you already have the blast module loaded.
/export/apps/blast/2.2.28+/bin/blastx


From Alan's Blast sbatch script example on the wiki, I also know where the nr database is:


/export/data/bio/ncbi/blast/db/nr
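The same lookup can be done inside a script, with a check that the answer really is a full path. This is only a sketch: resolve_cmd is a hypothetical helper name, and it relies on the shell's standard "command -v" lookup instead of which.

```shell
#!/bin/bash
# resolve_cmd NAME: print the absolute path of NAME, or fail with a message.
resolve_cmd() {
    local path
    path="$(command -v "$1")" || {
        echo "error: $1 not found; is the module loaded?" >&2
        return 1
    }
    # Shell builtins and functions do not resolve to a path; refuse those.
    case "$path" in
        /*) echo "$path" ;;
        *)  echo "error: $1 is not an executable on disk" >&2
            return 1 ;;
    esac
}

# Example: only report the location if blastx was actually found.
if BLASTX="$(resolve_cmd blastx)"; then
    echo "blastx is at: $BLASTX"
fi
```

On the system described here, with the blast module loaded, this should report the same location that which blastx gives.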


3) Edit the blast.sbatch file to include all of these changes. The final script looks like this:


#!/bin/bash

#SBATCH -p highmem
#SBATCH -n 8

/export/apps/blast/2.2.28+/bin/blastx -db /export/data/bio/ncbi/blast/db/nr -query /home/mkatari/ndl06-132-velvet31/contigs.fa -out /home/mkatari/ndl06-132-velvet31/contigs.fa.nr -num_threads 8 -outfmt 6 -evalue 0.00001
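Because the script asks for -outfmt 6, the result file is plain tab-separated text with one line per hit. As a rough sketch of how it could be post-processed, the helper below keeps only the highest-scoring hit for each contig. It assumes the default outfmt 6 column order, in which column 1 is the query id and column 12 is the bit score; best_hits is a hypothetical name and not part of BLAST.

```shell
#!/bin/bash
# best_hits FILE: print the single best hit per query from a tabular
# (-outfmt 6) BLAST result, judged by bit score.
# Assumption: default column order, so $1 = query id, $12 = bit score.
best_hits() {
    local tab
    tab="$(printf '\t')"
    # Sort by query id, then by bit score (numeric, descending),
    # and keep only the first line seen for each query id.
    sort -t "$tab" -k1,1 -k12,12nr "$1" | awk -F '\t' '!seen[$1]++'
}
```

For the job above it could be called as best_hits /home/mkatari/ndl06-132-velvet31/contigs.fa.nr once the job has finished.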


4) Run the sbatch file. As soon as you run the file a job id will be assigned to your submission.


sbatch blast.sbatch


You can check the status of all jobs on the cluster by typing:


squeue


You can check the details of your specific job by typing:


scontrol show job <your jobid>


You can cancel your job by typing:


scancel <your jobid>


The standard output (and, by default, the standard error) of your job is written to a file in the directory you submitted from, called:


slurm-<your jobid>.out
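Since every command in this step needs the job id, it can be handy to capture it when you submit. The sketch below assumes sbatch prints its usual "Submitted batch job <id>" line; job_id_from is a hypothetical helper name, and the commands that need a real cluster are left as comments.

```shell
#!/bin/bash
# job_id_from "Submitted batch job 12345": print just the numeric job id.
job_id_from() {
    echo "$1" | awk '/^Submitted batch job/ {print $4}'
}

# Typical use on the cluster (commented out so the sketch runs anywhere):
#   jobid="$(job_id_from "$(sbatch blast.sbatch)")"
#   squeue -j "$jobid"              # status of this job only
#   scontrol show job "$jobid"      # full details
#   cat "slurm-${jobid}.out"        # output written so far
```

The helper prints nothing if the submission message does not look as expected, so an empty jobid is a sign that the submission failed.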


mkatari-bioinformatics-august-2013-blastnotes.1376407099.txt.gz · Last modified: 2013/08/13 15:18 by mkatari