mkatari-bioinformatics-august-2013-blastnotes
Differences
This shows you the differences between two versions of the page.
mkatari-bioinformatics-august-2013-blastnotes [2014/07/02 13:58] – mkatari | mkatari-bioinformatics-august-2013-blastnotes [2014/07/11 09:04] – mkatari | ||
---|---|---|---|
Line 6: | Line 6: | ||
< | < | ||
- | perl / | + | perl / |
+ | | ||
-o scaffold12498.fa \ | -o scaffold12498.fa \ | ||
-query scaffold12498 | -query scaffold12498 | ||
Line 13: | Line 14: | ||
====== Creating the blast database ====== | ====== Creating the blast database ====== | ||
- | Before we can actually perform the blast, we need to prepare the database using **makeblastdb**. | + | Before we can actually perform the blast, we need to prepare the database using **makeblastdb**. |
< | < | ||
makeblastdb -in cassavaV5 -input_type " | makeblastdb -in cassavaV5 -input_type " | ||
+ | </ | ||
+ | |||
+ | ====== Running Blast ====== | ||
+ | |||
+ | Now to run blast we simply have to specify which blast program we want to use and provide the respective arguments. For this example we want to see where a scaffold from an older version of the assembly is located in the new assembly, so we are aligning a nucleotide query to a nucleotide database ( **blastn** ). blastn has a very large number of options. To get a detailed description of all of them, type **blastn -help**. | |
+ | |||
+ | Some of the options I find useful are: | ||
+ | -query = the name of the query sequence | ||
+ | -db = the name of the database | ||
+ | -out = the name of the output file | ||
+ | -outfmt = the format in which to save the output. The default is the traditional output that shows the alignments, but I also use -outfmt 6, which saves the results in tabular format. | |
+ | |||
+ | < | ||
+ | blastn -query scaffold12498.fa \ | ||
+ | -db cassavaV5 \ | ||
+ | -outfmt 6 \ | ||
+ | -out scaffold12498.cassavaV5.bout | ||
</ | </ | ||
Line 102: | Line 120: | ||
echo $OUTPUT | echo $OUTPUT | ||
- | / | + | / |
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | |||
+ | # the different columns in this output format are : | ||
+ | # Fields: query id, | ||
+ | subject id, | ||
+ | % identity, | ||
+ | alignment length, | ||
+ | mismatches, | ||
+ | gap opens, | ||
+ | query start, | ||
+ | query end, | ||
+ | subject start, | ||
+ | subject end, | ||
+ | evalue, | ||
+ | bit score | ||
</ | </ | ||
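The column list above can be used to pick out hits of interest from the tabular ( -outfmt 6 ) file. What follows is only a minimal sketch, not part of the original notes: the function name filter_hits and the thresholds are hypothetical, and it assumes the file is tab-separated with the twelve default columns in the order listed above.

```shell
# Hypothetical helper (not from the original notes): keep hits at or above
# a minimum % identity and alignment length, then sort by bit score,
# best hit first.
# Column numbers follow the field list above (tab-separated):
#   1 query id, 2 subject id, 3 % identity, 4 alignment length,
#   5 mismatches, 6 gap opens, 7 query start, 8 query end,
#   9 subject start, 10 subject end, 11 evalue, 12 bit score
filter_hits() {
    file="$1"          # tabular blast output (-outfmt 6)
    min_identity="$2"  # minimum % identity (column 3)
    min_length="$3"    # minimum alignment length (column 4)
    awk -F '\t' -v id="$min_identity" -v len="$min_length" \
        '$3 >= id && $4 >= len' "$file" |
        sort -t "$(printf '\t')" -k12,12nr
}
```

For example, `filter_hits scaffold12498.cassavaV5.bout 95 500` would print only the hits with at least 95% identity over at least 500 bases. These cutoffs are illustrative and should be chosen to suit the data.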
mkatari-bioinformatics-august-2013-blastnotes.txt · Last modified: 2015/06/04 12:38 by mkatari