mkatari-bioinformatics-august-2013-more-slurm

Differences

This shows you the differences between two versions of the page.

mkatari-bioinformatics-august-2013-more-slurm [2014/06/09 07:53] – created mkatari
mkatari-bioinformatics-august-2013-more-slurm [2014/06/09 08:19] (current) mkatari
Line 20:
 
 INPUT=$1
-OUTPUT=$INPUT.output2
+OUTPUT=$INPUT.output
 
 module load blast/2.2.28+
Line 32:
 </code>
 
 +=== Generating sbatch scripts on the fly ===
 +
 +When you have hundreds of files, running the same script by hand for each one is cumbersome. In the example below we get a list of input files and create a separate sbatch file for each of them.
 +
 +The backtick (not the apostrophe; on a US keyboard it is the key to the left of 1) can be used to capture the output of a command. Here we get a list of fasta files whose names start with the word test and store it in the variable $FILES. Note that $FILES holds the whole output of the ls command as a single string; because it is used unquoted in the for loop, the shell splits it on whitespace into individual file names (so this approach assumes the file names contain no spaces). The loop then works with one file at a time: at each iteration the file name is stored in the variable $INPUT.
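The capture-and-loop step can be tried on its own before any sbatch files are involved. The following is a minimal sketch, assuming a throwaway directory created with mktemp; the names WORKDIR, COUNT, and other.fa are made up for illustration and are not part of the original script.

```shell
# Throwaway directory so the sketch does not touch real data (WORKDIR is a hypothetical name).
WORKDIR=$(mktemp -d)
touch "$WORKDIR/test1.fa" "$WORKDIR/test2.fa" "$WORKDIR/other.fa"

# Backticks capture the output of ls as one string of file names.
FILES=`ls "$WORKDIR"/test*fa`

# Used unquoted, $FILES is split on whitespace, so the loop sees one name at a time.
COUNT=0
for INPUT in $FILES
do
    echo "file name $INPUT"
    COUNT=$((COUNT + 1))
done

echo "number of files: $COUNT"
```

Only the two files whose names start with test are visited; other.fa is skipped because it does not match the pattern.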
 +
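 +The blast commands stay the same as in the previous sbatch script. Note that we echo the entire sbatch script and redirect it into a new sbatch file. Because the echo uses double quotes, $INPUT and $OUTPUT are replaced with their values when the file is written. Also note that for a double quote inside the script to be written as a literal double quote, rather than closing echo's opening double quote, we precede it with the escape character \. The escape character tells the shell to treat the special character that follows it as a regular character.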
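The effect of the escape character is easiest to see by writing a single line into a file and looking at the result. This is a small sketch, assuming a temporary file created with mktemp; DEMO and the use of $HOSTNAME are illustrative choices, not part of the original script.

```shell
INPUT=test1.fa
# Temporary file standing in for the generated sbatch file (DEMO is a hypothetical name).
DEMO=$(mktemp)

# \" writes a literal double quote; $INPUT is expanded now, while the file is written.
echo "echo \"Ready to run Blast on $INPUT\"" > "$DEMO"

# \$ writes a literal dollar sign, so this variable is only expanded when the generated file runs.
echo "echo \"Running on \$HOSTNAME\"" >> "$DEMO"

cat "$DEMO"
```

The first generated line already contains the file name, while the second still contains the variable reference, which shows that the same escape character can be used to postpone expansion until the job runs.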
 +
 +The code below creates a new sbatch file for each fasta file and then submits the job. This is very useful if you are working with hundreds or thousands of files.
 +
 +<code>
 +#!/bin/env bash                                                                                      
 +#SBATCH -p batch                                                                                     
 +#SBATCH -J blastn                                                                                    
 +#SBATCH -n 4                                                                                         
 +
 +#the result of the ls command is captured in the variable FILES
 +FILES=`ls test*fa`
 +
 +#loop through all files in FILES; at each iteration, INPUT holds the name of one file
 +for INPUT in $FILES
 +do
 +
 +#this line gets printed to the screen
 +echo "file name $INPUT"
 +
 +#creating variables to store values                                                                  
 +SBATCH=$INPUT.blast.sbatch
 +OUTPUT=$INPUT.output
 +
 +#the output of the following echo is saved in the sbatch file to be executed later
 +echo "#!/bin/env bash
 +#SBATCH -p batch
 +#SBATCH -J blastn
 +#SBATCH -n 4
 +
 +module load blast/2.2.28+
 +
 +echo \"Ready to run Blast\"
 +
 +blastn -query $INPUT -db nt -out $OUTPUT -num_threads 4
 +echo \"Blast Done\"
 +" > $SBATCH
 +
 +#now that the file has been written, submit the sbatch file
 +sbatch $SBATCH
 +
 +#end of the loop. the code is repeated (starting at "do") until all files in FILES are done.
 +done
 +
 +</code>
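As a variant of the same idea (not the method used on this page), the generated script can be written with a here-document instead of echo, which avoids escaping the inner double quotes, and the loop can iterate over the glob directly instead of capturing ls. The sketch below is a dry run under stated assumptions: WORKDIR and the empty test files are made up for illustration, and the sbatch command is only printed, not executed.

```shell
# Throwaway directory with two empty stand-in fasta files (hypothetical names).
WORKDIR=$(mktemp -d)
touch "$WORKDIR/test1.fa" "$WORKDIR/test2.fa"

# Looping over the glob directly also copes with spaces in file names.
for INPUT in "$WORKDIR"/test*fa
do
    SBATCH=$INPUT.blast.sbatch
    OUTPUT=$INPUT.output

    # Unquoted EOF: $INPUT and $OUTPUT are expanded while the file is written,
    # and double quotes inside the here-document need no backslash.
    cat > "$SBATCH" <<EOF
#!/bin/env bash
#SBATCH -p batch
#SBATCH -J blastn
#SBATCH -n 4

module load blast/2.2.28+

echo "Ready to run Blast"

blastn -query $INPUT -db nt -out $OUTPUT -num_threads 4
echo "Blast Done"
EOF

    # Dry run: print the submit command instead of running it.
    echo "would submit: sbatch $SBATCH"
done
```

To submit for real on a cluster, the final echo would be replaced by the sbatch call used in the script above.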
mkatari-bioinformatics-august-2013-more-slurm.1402300401.txt.gz · Last modified: 2014/06/09 07:53 by mkatari