mkatari-bioinformatics-august-2013-blastnotes
mkatari-bioinformatics-august-2013-blastnotes [2013/08/13 15:29] – mkatari | mkatari-bioinformatics-august-2013-blastnotes [2014/07/02 13:58] – mkatari | ||
---|---|---|---|
+ | [[mkatari-bioinformatics-august-2013|Back to Manny' | ||
+ | |||
+ | ====== Getting specific fasta sequences from reference file ====== | ||
+ | |||
+ | If you need to retrieve a specific sequence from a larger FASTA file, use one of my Perl scripts. Provide a pattern to match against the definition line and the script retrieves the matching sequence. **-f** is the reference file, **-o** is the output file name, and **-query** is the pattern to match. | ||
+ | |||
+ | < | ||
+ | perl / | ||
+ | -o scaffold12498.fa \ | ||
+ | -query scaffold12498 | ||
+ | </ | ||
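If the Perl script is not at hand, the same idea can be sketched with awk. This is only an illustration, not the script used above: the function name `extract_fasta` and the file name `reference.fa` are hypothetical, and the match is a plain substring test on the definition line, so `scaffold12498` would also match a header containing `scaffold124980`.

```shell
#!/bin/bash
# extract_fasta PATTERN FILE
# Print every FASTA record whose definition line contains PATTERN.
# (Hypothetical helper; plain substring match, not a regular expression.)
extract_fasta() {
    local pattern=$1 fasta=$2
    awk -v pat="$pattern" '
        /^>/ { keep = (index($0, pat) > 0) }  # definition line: decide whether to keep this record
        keep                                  # print the header and sequence lines of kept records
    ' "$fasta"
}

# Example (reference.fa is an assumed file name):
#   extract_fasta scaffold12498 reference.fa > scaffold12498.fa
```

Because the decision is made once per definition line, multi-line sequences are printed in full until the next `>` line is reached.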
+ | |||
+ | ====== Creating the blast database ====== | ||
+ | |||
+ | Before we can actually perform the blast, we need to prepare the database using **makeblastdb**. | ||
+ | |||
+ | < | ||
+ | makeblastdb -in cassavaV5 -input_type " | ||
+ | </ | ||
+ | |||
====== Run Blast using sbatch ====== | ====== Run Blast using sbatch ====== | ||
Line 67: | Line 87: | ||
</ | </ | ||
+ | * Now imagine that you have to repeat this exact blast for many different sequences, but you do not want to create a new batch file each time or keep editing the same one. The paths to the input and output files in our current sbatch file are "hard coded" | ||
+ | * Luckily, we can define variables in a bash script and supply their values from the command line. A modified version of the sbatch script is provided below. | ||
+ | |||
+ | < | ||
+ | #!/bin/bash | ||
+ | |||
+ | #SBATCH -p batch | ||
+ | #SBATCH -n 8 | ||
+ | |||
+ | INPUT=$1 | ||
+ | OUTPUT=" | ||
+ | |||
+ | echo "$INPUT" | ||
+ | echo "$OUTPUT" | ||
+ | |||
+ | / | ||
+ | </ | ||
+ | |||
+ | Arguments on the command line are read by the bash script in order. The values are automatically assigned to the positional parameters $1, $2, $3, ... as they are read from the command line. It is a good idea to reassign these to variables with names that make sense to us. The first string of characters (without spaces) provided after the script name is assigned to $1, and the variable INPUT is then given this value. The script above also shows how to create a new variable OUTPUT, which contains the same information as INPUT but now also contains a " | ||
+ | |||
+ | To refer to the value stored in a variable, simply put $ in front of its name, as shown in the blast command line. | ||
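The mechanics of positional parameters can be tried without submitting a job. The sketch below uses a shell function, where `$1` behaves the same way as in a script. The function name `describe_args` and the `.out` suffix are assumptions made for illustration; the suffix used in the actual sbatch script may differ.

```shell
#!/bin/bash
# describe_args INPUT
# Show how a positional parameter is copied into named variables.
describe_args() {
    local input=$1               # first argument after the function (or script) name
    local output="${input}.out"  # hypothetical suffix appended to the input name
    echo "INPUT=$input"
    echo "OUTPUT=$output"
}

describe_args scaffold12498.fa
# prints:
#   INPUT=scaffold12498.fa
#   OUTPUT=scaffold12498.fa.out
```

Writing `${input}` with braces keeps the variable name separate from the text that follows it, which matters when the suffix starts with a letter, digit, or underscore.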
+ | |||
+ | To execute this sbatch file you would simply provide the name of the input file as shown below. | ||
+ | |||
+ | < | ||
+ | sbatch / | ||
+ | </ |
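Since the script now takes its input from the command line, submitting many sequences becomes a simple loop. The following is a sketch under stated assumptions: `submit_all` and the `DRY_RUN` variable are hypothetical names, and `blast.sbatch` stands in for the path to the sbatch file shown above.

```shell
#!/bin/bash
# submit_all SCRIPT FILE...
# Submit one sbatch job per input file.
# With DRY_RUN=1 the commands are printed instead of being run,
# which is useful for checking the file list first.
submit_all() {
    local script=$1
    shift                        # drop the script name; "$@" now holds only the input files
    local f
    for f in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "sbatch $script $f"
        else
            sbatch "$script" "$f"
        fi
    done
}

# Example (blast.sbatch is an assumed file name):
#   DRY_RUN=1 submit_all blast.sbatch *.fa
```

Each input file becomes its own job, so the scheduler can run them independently as resources become available.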
mkatari-bioinformatics-august-2013-blastnotes.txt · Last modified: 2015/06/04 12:38 by mkatari