mkatari-bioinformatics-august-2013-bowtienotes

Differences

This shows you the differences between two versions of the page.

mkatari-bioinformatics-august-2013-bowtienotes [2014/04/30 15:29]
mkatari
mkatari-bioinformatics-august-2013-bowtienotes [2014/07/03 16:19] (current)
mkatari
Line 1: Line 1:
 [[mkatari-bioinformatics-august-2013|Back to Manny'​s Bioinformatics Workshop Home]] [[mkatari-bioinformatics-august-2013|Back to Manny'​s Bioinformatics Workshop Home]]
  
-====== Creating and Running Bowtie using Cassava ======
 +====== Creating Bowtie Index and Performing an Alignment using Cassava as a Reference ======
  
-Make you are in "interactive mode"
 +This is a quick example of how to build a bowtie index and run bowtie. Normally you will create the index only once, so there is no need to create a special script for it. Just make sure you are in **interactive mode** and we can do everything on the command line.
  
-1) Download cassava genome from phytozome
 +Once you have found the sequence you want to use as a reference on the web, you can use the Linux command wget to download it quickly. To keep our data organized, let's create a directory called **cassava** where we will store the file.
 +
 +So the first steps are to create the directory and download the file.
  
 <​code>​ <​code>​
 +mkdir ~/cassava
 +
 +cd ~/cassava
 +
 wget ftp://​ftp.jgi-psf.org/​pub/​compgen/​phytozome/​v9.0/​Mesculenta/​assembly/​Mesculenta_147.fa.gz wget ftp://​ftp.jgi-psf.org/​pub/​compgen/​phytozome/​v9.0/​Mesculenta/​assembly/​Mesculenta_147.fa.gz
 </​code>​ </​code>​
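If these commands are run a second time, mkdir reports that the directory already exists and wget fetches the file again. Below is a minimal sketch of a rerun-safe version. The helper name fetch_reference is made up for this example and is not part of wget; the URL in the commented call is the one used above.

```shell
# fetch_reference is a hypothetical helper name, not part of wget.
# Usage: fetch_reference URL DIRECTORY
fetch_reference() {
    url=$1
    dir=$2
    file=$dir/$(basename "$url")

    mkdir -p "$dir" || return 1      # -p: no error if the directory already exists

    if [ -s "$file" ]; then          # a non-empty copy is already there: skip the download
        echo "already downloaded: $file"
        return 0
    fi

    wget -P "$dir" "$url"            # -P: save the file into this directory
}

# Example call (needs network access), left commented out:
# fetch_reference ftp://ftp.jgi-psf.org/pub/compgen/phytozome/v9.0/Mesculenta/assembly/Mesculenta_147.fa.gz ~/cassava
```

Note that the check only looks for a non-empty file, so a partially downloaded file would also be skipped. In that case delete the file, or resume the download with wget -c.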
  
-2) created a directory called cassava
 +Now we will uncompress the file using gunzip.
-<​code>​ +
-mkdir cassava +
-</​code>​+
  
-3) move fasta sequence in the directory 
 <​code>​ <​code>​
-mv Mesculenta_147.fa.gz cassava/ 
-</​code>​ 
-4) uncompress using gunzip 
-<​code>​ 
-cd cassava 
 gunzip Mesculenta_147.fa.gz gunzip Mesculenta_147.fa.gz
 </​code>​ </​code>​
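Keep in mind that gunzip replaces the .gz file with the uncompressed file. If you would rather check the download first and keep the compressed copy, something like the sketch below should work. It uses a toy file (toy.fa is a made-up name) in a temporary directory so that it can be tried without touching the real data.

```shell
# Toy example in a temporary directory; the file name toy.fa is made up.
workdir=$(mktemp -d)
printf '>scaffold1\nACGT\n' > "$workdir/toy.fa"
gzip "$workdir/toy.fa"                    # produces toy.fa.gz and removes toy.fa

# -t: test the integrity of the compressed file without extracting it
gzip -t "$workdir/toy.fa.gz" && echo "archive OK"

# -d: decompress, -c: write to standard output, so toy.fa.gz is kept
gzip -dc "$workdir/toy.fa.gz" > "$workdir/toy.fa"

ls "$workdir"                             # both toy.fa and toy.fa.gz are listed
```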
-5) Count number of scaffold in the file
 +
 +To see how many scaffolds we have in our file, we can **grep** for the greater-than sign and then use the command **wc** to count lines.
 <​code>​ <​code>​
-grep ">"​ Mesculenta_147.fa | wc+grep ">"​ Mesculenta_147.fa | wc -l
 </​code>​ </​code>​
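To see why this works, here is a small sketch on a toy FASTA file (the file and scaffold names are made up). Every sequence in a FASTA file starts with one header line beginning with the greater-than sign, so counting those lines counts the sequences.

```shell
# Build a toy FASTA file with three sequences in a temporary directory.
toy_dir=$(mktemp -d)
toy_fa=$toy_dir/toy.fa
cat > "$toy_fa" <<'EOF'
>scaffold00001
ACGTACGT
>scaffold00002
GGCC
TTAA
>scaffold00003
ACGT
EOF

# Same approach as above: keep the header lines, then count them.
grep ">" "$toy_fa" | wc -l

# Shorter alternative: -c makes grep print the number of matching lines,
# and ^ restricts the match to lines that start with ">".
grep -c '^>' "$toy_fa"    # prints 3
```

The quotes around the greater-than sign matter. Without them the shell treats it as output redirection and empties the file before grep even runs.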
-6) load bowtie module
 +
 +In order to use Bowtie2 commands we first have to load the module.
 <​code>​ <​code>​
 module load bowtie2 module load bowtie2
 </​code>​ </​code>​
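If you are not sure whether the module was loaded, you can ask the shell whether it can find the program. A minimal sketch, assuming a made-up helper name have_tool:

```shell
# have_tool is a hypothetical helper name.
# It reports whether a program can be found on the PATH.
have_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found"
    else
        echo "$1 not found on PATH; has the module been loaded?"
        return 1
    fi
}

# Example calls after loading the module:
# have_tool bowtie2
# have_tool bowtie2-build
```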
-7) Create the bowtie index.
 +
 +To create the index, which will be used like a database when we align the reads to the reference, we use the command bowtie2-build. The first argument is the FASTA file and the second is the name of the index. For cassava this should take about 10 minutes.
 <​code>​ <​code>​
 bowtie2-build Mesculenta_147.fa cassava bowtie2-build Mesculenta_147.fa cassava
 </​code>​ </​code>​
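When bowtie2-build finishes, the directory should contain six index files whose names start with the index name: cassava.1.bt2 to cassava.4.bt2, cassava.rev.1.bt2 and cassava.rev.2.bt2. The sketch below checks that all six are present. The helper name index_complete is made up, and it only checks that the files exist, not that they are valid.

```shell
# index_complete is a hypothetical helper; it only checks that the six
# index files exist for the given index name.
# Usage: index_complete INDEX_NAME
index_complete() {
    base=$1
    for suffix in 1 2 3 4 rev.1 rev.2; do
        # small indexes end in .bt2; indexes of very large genomes end in .bt2l
        if [ ! -e "$base.$suffix.bt2" ] && [ ! -e "$base.$suffix.bt2l" ]; then
            echo "missing: $base.$suffix.bt2"
            return 1
        fi
    done
    echo "index looks complete: $base"
}

# Example call from inside the cassava directory:
# index_complete cassava
```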
-8) Run Bowtie using single end fastq as input. See bowtie2 for instructions on how to run it on pair-end sequences.
 +
 +Depending on the type of sequences you have (single-end or paired-end), the options to run bowtie are slightly different. To run single-end reads, use -U followed by the file name. For paired-end reads, use -1 followed by the first file and -2 followed by the second file.
 +See the [[http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml|bowtie2]] manual for more detailed options and arguments that bowtie2 accepts.
 <​code>​ <​code>​
 bowtie2 -x cassava/​cassava -U test.fastq -S test.sam bowtie2 -x cassava/​cassava -U test.fastq -S test.sam
 </​code>​ </​code>​
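In the command above, cassava/cassava means the index named cassava inside the directory cassava, relative to the directory you run bowtie2 from. To make the difference between single-end and paired-end runs concrete, here is a small sketch that only prints the command it would run. The helper name bowtie2_cmd and the file names test_1.fastq and test_2.fastq are made up for this example.

```shell
# bowtie2_cmd is a hypothetical helper that only PRINTS a bowtie2 command;
# it does not run bowtie2.
# Usage: bowtie2_cmd INDEX OUT.sam READS.fastq [MATES.fastq]
bowtie2_cmd() {
    index=$1
    out=$2
    if [ $# -eq 3 ]; then
        # one read file: single-end, passed with -U
        echo "bowtie2 -x $index -U $3 -S $out"
    elif [ $# -eq 4 ]; then
        # two read files: paired-end, passed with -1 and -2
        echo "bowtie2 -x $index -1 $3 -2 $4 -S $out"
    else
        echo "usage: bowtie2_cmd INDEX OUT.sam READS.fastq [MATES.fastq]" >&2
        return 1
    fi
}

bowtie2_cmd cassava/cassava test.sam test.fastq
bowtie2_cmd cassava/cassava test.sam test_1.fastq test_2.fastq
```

Copy the printed line to the command line to actually run the alignment.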
mkatari-bioinformatics-august-2013-bowtienotes.txt · Last modified: 2014/07/03 16:19 by mkatari