Some more useful SLURM notes

Reading Variable from command line

Building on Alan's notes on Using SLURM, we can also pass variables to a script on the command line. If you have ten different files that you want to process separately, you should not have to create ten different scripts. One script can do the job, as long as you provide the input file as an argument.

In shell scripting, the first word following the name of your script is assigned to the variable $1, and the second word to $2. Because order matters, you must document for users of the script what is expected in the first argument and in each argument that follows.
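To make the role of $1 and $2 concrete, here is a minimal sketch. The function name describe_args and the default database "nt" for a missing second argument are assumptions made for illustration; they are not part of the original notes. A shell function receives its arguments in $1, $2, ... the same way a script does, which makes the behaviour easy to try out.

```shell
#!/usr/bin/env bash
# describe_args is a hypothetical helper used only to illustrate positional parameters.
describe_args() {
    # $1 is the first word after the name, $2 is the second word.
    local query=$1
    # If no second argument is given, fall back to "nt" (an assumed default).
    local database=${2:-nt}
    echo "query=$query database=$database"
}

describe_args test1.fa              # prints: query=test1.fa database=nt
describe_args test1.fa refseq_rna   # prints: query=test1.fa database=refseq_rna
```

Swapping the two words on the command line would swap what ends up in $1 and $2, which is why the expected order has to be documented.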

In the script you can access these variables, modify them, and create new ones. In the blastn script below, the first argument (a FASTA file) is stored in INPUT, and a new variable OUTPUT specifies where the output should be stored. Note that there is no $ in front of a variable name when it is being assigned; the $ is needed only when the value of the variable is read.
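The difference between assigning a variable and reading it can be sketched in a few lines. The file name test1.fa and the variable BASENAME are assumptions for illustration only.

```shell
#!/usr/bin/env bash
# Assignment: no $ on the left-hand side and no spaces around "=".
INPUT=test1.fa

# Reading INPUT needs a $; the variable being assigned (OUTPUT) still has none.
OUTPUT=$INPUT.output

# Parameter expansion can also modify a value: drop a trailing ".fa".
BASENAME=${INPUT%.fa}

echo "$OUTPUT"                # prints: test1.fa.output
echo "$BASENAME.blast.out"    # prints: test1.blast.out
```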

The echo statements in the script print a message to the SLURM report. This can be useful for confirming that the job ran to the end.

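#!/usr/bin/env bash
#SBATCH -p batch
#SBATCH -J blastn
#SBATCH -n 4

# first command-line argument: the FASTA file to use as the query
INPUT=$1
# the output file is named after the input file
OUTPUT=$INPUT.output

module load blast/2.2.28+

echo "Ready to run Blast"

# quote the variables so that unusual file names are passed as a single word
blastn -query "$INPUT" -db nt -out "$OUTPUT" -num_threads 4

echo "Blast done"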
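One way to use a script like this for many files is to submit it once per input. Words after the script name on the sbatch command line are passed to the script as its arguments. The sketch below assumes the script was saved as blast.sbatch (a hypothetical file name) and uses a hypothetical helper, submit_all. By default it only prints the commands it would run, so it can be tried without a cluster.

```shell
#!/usr/bin/env bash
# SUBMIT defaults to "echo sbatch", which prints each command (a dry run).
# Set SUBMIT=sbatch before running to submit the jobs for real.
SUBMIT=${SUBMIT:-echo sbatch}

# submit_all SCRIPT FILE...  (hypothetical helper: one job per FASTA file)
submit_all() {
    local script=$1
    shift                      # drop the script name; "$@" now holds only the files
    local fasta
    for fasta in "$@"; do
        # $SUBMIT is deliberately unquoted so "echo sbatch" splits into two words.
        $SUBMIT "$script" "$fasta"
    done
}

submit_all blast.sbatch test1.fa test2.fa
```

With the default dry-run setting this prints "sbatch blast.sbatch test1.fa" and "sbatch blast.sbatch test2.fa". In practice the file list could come from a pattern such as test*fa.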

Generating sbatch scripts on the fly

Instead of passing the file name as an argument, a script can write a separate sbatch file for each input file and submit it right away.

#!/usr/bin/env bash
#SBATCH -p batch
#SBATCH -J blastn
#SBATCH -n 4

#the result of the ls command is captured in the variable FILES
FILES=$(ls test*fa)

#loop through all files in FILES; in each iteration, INPUT holds the name of one file
for INPUT in $FILES
do

#this line gets printed to the screen
echo "file name $INPUT"

#create variables to store the names of the sbatch file and the output file
SBATCH=$INPUT.blast.sbatch
OUTPUT=$INPUT.output

#the following echo is saved in the sbatch file to be executed later
#$INPUT and $OUTPUT are replaced by their values as the file is written
echo "#!/usr/bin/env bash
#SBATCH -p batch
#SBATCH -J blastn
#SBATCH -n 4

module load blast/2.2.28+

echo \"Ready to run Blast\"

blastn -query $INPUT -db nt -out $OUTPUT -num_threads 4
echo \"Blast Done\"
" > "$SBATCH"

#now that the file has been written, submit the sbatch file
sbatch "$SBATCH"

#end of the loop. the code is repeated (starting at "do") until all files in FILES are done.
done
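
As an alternative to a long echo with escaped quotes, the same file can be written with a here-document. This is only a sketch of that technique, not the method used above. The helper name write_blast_job is hypothetical, and the extra message using SLURM_JOB_ID (an environment variable that SLURM sets inside a running job) is added for illustration. With an unquoted EOF delimiter, $input is expanded while the file is being written, whereas the escaped \$SLURM_JOB_ID is written literally and expanded only when the job runs.

```shell
#!/usr/bin/env bash
# write_blast_job FILE  (hypothetical helper)
# Writes FILE.blast.sbatch in the current directory and prints its name.
write_blast_job() {
    local input=$1
    local job=$input.blast.sbatch
    # Unquoted EOF: variables are expanded now unless the $ is escaped.
    cat > "$job" <<EOF
#!/usr/bin/env bash
#SBATCH -p batch
#SBATCH -J blastn
#SBATCH -n 4

module load blast/2.2.28+

echo "Job \$SLURM_JOB_ID: ready to run Blast on $input"
blastn -query "$input" -db nt -out "$input.output" -num_threads 4
echo "Blast done"
EOF
    echo "$job"
}
```

Inside the loop, the generated file could then be submitted with sbatch "$(write_blast_job "$INPUT")". Double quotes inside a here-document need no escaping, which keeps the generated script easier to read.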
mkatari-bioinformatics-august-2013-more-slurm.1402300725.txt.gz · Last modified: 2014/06/09 07:58 by mkatari