mpiblast

Differences

This shows you the differences between two versions of the page.

mpiblast [2010/01/18 10:14] 172.26.0.166
mpiblast [2010/01/19 12:51] 172.26.0.166
Line 3: Line 3:
  
 http://wiki.bioinformatics.ucdavis.edu/index.php/MPI_Blast
 +OpenMPI FAQ: http://www.open-mpi.org/faq/
  
 <code>$ mpiformatdb -i drosoph.nt -p F --nfrags=12</code>
  
-  * **--nfrags=10** Specifies how many database fragments you want to split the original database into. This should be equal to how many different nodes you want to run mpiblast on.
 +  * **nfrags** specifies how many database fragments you want to split the original database into. This should be equal to how many different nodes you want to run mpiblast on.
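As a rough illustration of the one-fragment-per-node rule described above, the following shell sketch derives the ''--nfrags'' value from a node count and prints the resulting ''mpiformatdb'' command instead of running it. The variable names ''nodes'' and ''db'' are placeholders chosen for this example, not part of mpiBLAST.

```shell
# Placeholder values: set these to match your cluster and database.
nodes=12          # number of nodes mpiblast will run on
db=drosoph.nt     # FASTA file to format

# One fragment per node, following the rule above.
cmd="mpiformatdb -i ${db} -p F --nfrags=${nodes}"

# Print the command for review rather than executing it.
echo "$cmd"
```

Keeping the node count in one variable makes it harder for the fragment count and the job size to drift apart when the cluster layout changes.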
  
-http://www.ncbi.nlm.nih.gov/blast/docs/update_blastdb.pl 
  
-===== Updating BLAST Databases ===== 
-http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/blastdb.html 
 ===== Notes on .ncbirc ====
 Notes on setting up the ''~/.ncbirc'' file from the mpiBLAST installation page: http://www.mpiblast.org/Docs/Install#unix
Line 36: Line 34:
 ===== Frequently Asked Questions =====
 Collection of the more-helpful questions and answers from the [[http://www.mpiblast.org/Docs/FAQ|mpiBLAST FAQ]].
- 
 ====How do I format a huge database?====
  
 Large databases like nt can consume several gigabytes of disk space and it is preferable to store them in compressed form. Starting with mpiBLAST 1.4.0 it is possible to pipe FastA formatted sequence data into mpiformatdb. This feature provides the ability to directly format a compressed (gzip/bzip etc.) database using command line syntax like:
 <code>$ zcat nt.gz | mpiformatdb -i stdin -N 100 -t nt -p F</code>
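The same pipe should work for other compression formats, assuming the matching decompressor is installed. The sketch below picks a decompress-to-stdout command from the file extension and prints the resulting pipeline. The helper name ''decompressor_for'' and the file name ''nt.bz2'' are hypothetical; the ''mpiformatdb'' options are copied unchanged from the example above.

```shell
# Hypothetical helper: choose a decompress-to-stdout command by file extension.
decompressor_for() {
  case "$1" in
    *.gz)  echo "zcat" ;;
    *.bz2) echo "bzcat" ;;
    *)     echo "cat" ;;   # assume the file is already uncompressed
  esac
}

archive=nt.bz2   # placeholder file name

# Print the pipeline for review rather than running it.
echo "$(decompressor_for "$archive") $archive | mpiformatdb -i stdin -N 100 -t nt -p F"
```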
 +
 +==== SGE Support ====
 +See this FAQ entry: http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
 +
 +<code>$ ompi_info | grep gridengine
 +                 MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3.2)</code>
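The check above can be wrapped in a small script so that a job setup step fails early when Open MPI was built without SGE support. This is a minimal sketch: the function name ''has_gridengine'' is made up for this example, and the sample line is copied from the output shown above so the snippet runs without Open MPI installed. On a real system, pipe ''ompi_info'' into the function instead.

```shell
# Succeeds if ompi_info-style text on stdin lists a gridengine component.
has_gridengine() {
  grep -q 'MCA ras: gridengine'
}

# Sample line taken from the ompi_info output above.
sample='                 MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.3.2)'

if printf '%s\n' "$sample" | has_gridengine; then
  echo "SGE support: yes"
else
  echo "SGE support: no"
fi
```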
 +===== Benchmarks =====
 +<code>$ time blastall -d drosoph.nt -p blastn -i drosoph.seq -o drosoph.result
 +        
 +real    7m48.052s
 +user    7m40.775s
 +sys     0m6.732s</code>
 +
 +<code>$ time /opt/openmpi/bin/mpirun -np 4 /opt/Bio/mpiblast/bin/mpiblast -d drosoph.nt -i drosoph.seq -p blastn -o mpi_drosoph_result.txt
 +Total Execution Time: 395.754
 +
 +real    6m36.841s
 +user    12m13.891s
 +sys     0m56.631s</code>
 +
 +Running mpiblast under SGE with 12 jobs across 6 nodes took:
 +<code>$ less mpiblast_sge.sh.o5515
 +Total Execution Time: 98.3068</code>
 +
 +<code>$ time pb blastall -d alan_drosoph -p blastn -i sequences/drosoph.seq -o drosoph.result
 +                                                                               
 +real    3m6.163s
 +user    0m0.046s
 +sys     0m1.423s</code>
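To compare the runs above, the timings can be converted to seconds and divided into the serial baseline. The sketch below does this with ''awk''. The helper names ''to_seconds'' and ''speedup'' are invented for this example. Note that the SGE figure is mpiblast's reported "Total Execution Time" rather than a ''real'' value from ''time'', so that comparison is only approximate.

```shell
# Convert a `time`-style value such as "7m48.052s" to seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Speedup = baseline time / parallel time, rounded to two decimals.
speedup() {
  awk -v base="$1" -v t="$2" 'BEGIN { printf "%.2f\n", base / t }'
}

serial=$(to_seconds 7m48.052s)   # plain blastall, "real" time
mpi4=$(to_seconds 6m36.841s)     # mpirun -np 4, "real" time
pb=$(to_seconds 3m6.163s)        # pb blastall, "real" time
sge12=98.3068                    # SGE run, mpiblast "Total Execution Time"

echo "mpirun -np 4: $(speedup "$serial" "$mpi4")x"
echo "SGE, 12 jobs: $(speedup "$serial" "$sge12")x"
echo "pb blastall:  $(speedup "$serial" "$pb")x"
```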
 +
 +
 +The number of processes for an MPI job should be one more than the number of CPUs, because one process acts as the master that controls the others.
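A minimal sketch of the rule above: compute the ''-np'' value from the CPU count and print the ''mpirun'' command line. The variable ''cpus'' is a placeholder for this example. The paths and mpiblast options are copied from the benchmark command earlier on this page, and the command is printed rather than executed.

```shell
cpus=4               # placeholder: CPUs available to the job
np=$((cpus + 1))     # one extra process for the master

# Paths copied from the benchmark above; printed, not executed.
echo "/opt/openmpi/bin/mpirun -np ${np} /opt/Bio/mpiblast/bin/mpiblast -d drosoph.nt -i drosoph.seq -p blastn -o mpi_drosoph_result.txt"
```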
 +
 +===== Links =====
 +  * Submitting MPI jobs using SGE: http://www.shef.ac.uk/wrgrid/documents/gridengine.html
 +  * mpiBLAST Guide: http://www.mpiblast.org/Docs/Guide
 +  * Updating the BLAST databases: http://www.ncbi.nlm.nih.gov/blast/docs/update_blastdb.pl