--- mkatari-bioinformatics-august-2013-bowtienotes [2014/04/30 12:29] – mkatari
+++ mkatari-bioinformatics-august-2013-bowtienotes [2014/07/03 13:19] (current) – mkatari
@@ Line 1: / Line 1: @@
 [[mkatari-bioinformatics-august-2013|Back to Manny's Bioinformatics Workshop Home]]
-====== Creating and Running Bowtie using Cassava ======
+====== Creating a Bowtie Index and Performing an Alignment Using Cassava as a Reference ======
-Make you are in "interactive mode"
+This is a quick example of how to build a Bowtie index and run Bowtie. Normally you create the index only once, so there is no need to write a special script for it. Just make sure you are in **interactive mode** and we can do everything on the command line.
-) Download cassava genome from phytozome
+Once you have found the sequence on the web that you want to use as a reference, you can use the Linux command wget to download it quickly. To keep our data organized, let's create a directory called **cassava** where we will store the file.
+So the first steps are to create the directory and download the file.
 <code>
+mkdir ~/cassava
+cd ~/cassava
 wget ftp://ftp.jgi-psf.org/pub/compgen/phytozome/v9.0/Mesculenta/assembly/Mesculenta_147.fa.gz
 </code>
-) created a directory called cassava
+Now we will uncompress the file using gunzip.
-<code>
-mkdir cassava
-</code>
-) move fasta sequence in the directory
 <code>
-mv Mesculenta_147.fa.gz cassava/
-</code>
-) uncompress using gunzip
-<code>
-cd cassava
 gunzip Mesculenta_147.fa.gz
 </code>
-) Count number of scaffold in the file
+To see how many scaffolds we have in our file, we can **grep** for the greater-than sign, which starts every FASTA header line, and then use the command **wc -l** to count the matching lines.
 <code>
-grep ">" Mesculenta_147.fa | wc
+grep ">" Mesculenta_147.fa | wc -l
 </code>
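As a small aside, the same count can be produced by grep alone with its -c option, and anchoring the pattern with ^ matches only lines that begin with the greater-than sign. The sketch below assumes a tiny made-up FASTA file (toy.fa, not part of the workshop data) so that it can be run anywhere.

```shell
# Build a tiny, made-up FASTA file with three records (hypothetical data).
tmpdir=$(mktemp -d)
printf '>scaffold1\nACGT\n>scaffold2\nGGCC\nTTAA\n>scaffold3\nA\n' > "$tmpdir/toy.fa"

# Count header lines the way the tutorial does: grep piped into wc -l.
count_wc=$(grep ">" "$tmpdir/toy.fa" | wc -l | tr -d ' ')

# Same count with grep alone; ^ anchors the match to the start of the line.
count_grep=$(grep -c '^>' "$tmpdir/toy.fa")

echo "wc -l: $count_wc  grep -c: $count_grep"
rm -r "$tmpdir"
```

Both approaches should agree on a well-formed FASTA file, since the greater-than sign only appears at the start of header lines.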
-) load bowtie module
+In order to use Bowtie2 commands we first have to load the module.
 <code>
 module load bowtie2
 </code>
-) Create the bowtie index.
+In order to create the index, which will be used like a database when we are aligning the reads to the reference, we use the command bowtie2-build. For cassava this should take about 10 minutes.
 <code>
 bowtie2-build Mesculenta_147.fa cassava
 </code>
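bowtie2-build writes the index as several files that share the prefix given as its second argument (here cassava). For a genome of this size the files normally end in .1.bt2, .2.bt2, .3.bt2, .4.bt2, .rev.1.bt2 and .rev.2.bt2; very large genomes get a .bt2l suffix instead, which this sketch does not handle. The helper name check_index is made up for illustration and is not part of Bowtie2.

```shell
# check_index PREFIX: report whether all six small-index files exist.
# check_index is a hypothetical helper, not part of Bowtie2.
check_index() {
    prefix=$1
    missing=0
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        # -s is true only if the file exists and is not empty.
        if [ ! -s "$prefix.$suffix" ]; then
            echo "missing or empty: $prefix.$suffix"
            missing=1
        fi
    done
    return $missing
}

# Example: run from the directory where bowtie2-build was executed.
if check_index cassava; then
    echo "index looks complete"
else
    echo "index is incomplete; rerun bowtie2-build"
fi
```

A check like this can be handy if the build was interrupted, since a partial index may otherwise only show up as an error when the alignment starts.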
-) Run Bowtie using single end fastq as input. See bowtie2 for instructions on how to run it on pair-end sequences.
+Depending on the type of sequences you have (single-end or paired-end), the options to run bowtie2 are slightly different. To run single-end reads, use -U followed by the file name. If the reads are paired-end, use -1 followed by the first file and -2 followed by the second file.
+See the [[http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml|bowtie2]] manual for more detailed options and arguments that bowtie2 accepts.
 <code>
 bowtie2 -x cassava/cassava -U test.fastq -S test.sam
 </code>
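Only the single-end command is shown above, so here is a sketch of both forms side by side. The file names reads_1.fastq and reads_2.fastq are placeholders, and the wrapper function bowtie2_cmd is made up for this example; it only prints the command, so it can be tried without the bowtie2 module loaded.

```shell
# bowtie2_cmd INDEX OUT.sam READS1 [READS2]
# Hypothetical helper: prints the bowtie2 command for single-end input
# (one reads file, -U) or paired-end input (two reads files, -1 and -2).
bowtie2_cmd() {
    index=$1
    out=$2
    if [ $# -eq 3 ]; then
        echo "bowtie2 -x $index -U $3 -S $out"
    elif [ $# -eq 4 ]; then
        echo "bowtie2 -x $index -1 $3 -2 $4 -S $out"
    else
        echo "usage: bowtie2_cmd INDEX OUT.sam READS1 [READS2]" >&2
        return 1
    fi
}

# Single-end, same as the tutorial command.
bowtie2_cmd cassava/cassava test.sam test.fastq
# Paired-end; reads_1.fastq and reads_2.fastq are placeholder names.
bowtie2_cmd cassava/cassava paired.sam reads_1.fastq reads_2.fastq
```

Once the printed line looks right, it can be pasted at the prompt to run the actual alignment.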