==== Message Passing Interface (MPI): The Concept ====
  
----
The MPI interface is meant to provide essential virtual topology, synchronization, and communication functionality between a set of processes (that have been mapped to nodes/servers/computer instances) in a language-independent way, with language-specific syntax (bindings), plus a few features that are language-specific. MPI programs always work with processes, but programmers commonly refer to the processes as processors. Typically, for maximum performance, each CPU (or core in a multicore machine) will be assigned just a single process. This assignment happens at runtime through the agent that starts the MPI program, normally called **//mpirun//** or **//mpiexec//**.
http://www.open-mpi.org/software/ompi/v1.3/

If you are simply looking for how to run an MPI application, you probably want to use a command line of the following form:

<file>shell$ mpirun [ -np X ] [ --hostfile <filename> ] <program>

This will run X copies of <program> in your current run-time environment (if running under a
supported resource manager, Open MPI's mpirun will usually automatically use the corresponding
resource manager process starter, as opposed to, for example, rsh or ssh, which require the
use of a hostfile, or will default to running all X copies on the localhost), scheduling
(by default) in a round-robin fashion by CPU slot. See the mpirun man page for more details.</file>
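For example, to launch four processes across two hosts listed in a hostfile (the host names and program name below are placeholders):

<file>shell$ cat my_hostfile
node1 slots=2               # "slots" is the number of processes to place on this host
node2 slots=2
shell$ mpirun -np 4 --hostfile my_hostfile ./my_program</file>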
=== Installation ===
----

<file>$ wget http://www.open-mpi.org/software/ompi/v1.3/downloads/openmpi-1.3.3.tar.gz
$ tar xfz openmpi-1.3.3.tar.gz
$ cd openmpi-1.3.3
$ ./configure
$ make && make install</file>
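Note that //make install// into the default prefix (/usr/local) usually requires root privileges. Once it completes, you can sanity-check the installation with the bundled tools:

<file>$ ompi_info | head           # prints the Open MPI version and build configuration
$ mpicc --showme             # shows the underlying compiler command the wrapper invokes</file>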
  
HPC environments are often measured in terms of FLoating point OPerations per Second (FLOPS).
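For a rough sense of scale, a hypothetical cluster of 100 nodes, each with two quad-core 2.5 GHz processors retiring 4 floating-point operations per core per cycle, would have a theoretical peak of 100 × 8 × 2.5 GHz × 4 = 8 TFLOPS; sustained performance, as measured by benchmarks such as LINPACK, is normally lower.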
  
==== Condor ====
  
----
http://www.cs.wisc.edu/condor/downloads-v2/download.pl
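As a quick orientation, a Condor job is described in a small submit file and handed to the scheduler with //condor_submit//. A minimal sketch (the executable and file names are placeholders):

<file>$ cat job.sub
universe   = vanilla
executable = my_program
output     = my_program.out
error      = my_program.err
log        = my_program.log
queue
$ condor_submit job.sub
$ condor_q                   # check the state of the queued job</file>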
  
==== Sun Grid Engine (SGE) ====
----

  
  
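Jobs are normally handed to SGE as a shell script whose #$ lines carry the scheduler options. A minimal sketch (the job, script, and program names are placeholders):

<file>$ cat job.sh
#!/bin/sh
#$ -N myjob                  # job name
#$ -cwd                      # run in the submission directory
#$ -o myjob.out              # stdout file
#$ -e myjob.err              # stderr file
./my_program
$ qsub job.sh
$ qstat                      # check the state of the queued job</file>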
==== SLURM: A Highly Scalable Resource Manager ====

SLURM is an open-source resource manager designed for Linux clusters of all sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work. https://computing.llnl.gov/linux/slurm/
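In practice, work is handed to SLURM as a batch script whose #SBATCH lines request the resources described above; a minimal sketch with placeholder names:

<file>$ cat job.sh
#!/bin/sh
#SBATCH --job-name=myjob
#SBATCH --nodes=2            # number of allocated compute nodes
#SBATCH --ntasks=8           # total number of tasks (processes)
#SBATCH --time=00:10:00      # wall-clock limit
srun ./my_program            # launch the tasks on the allocated nodes
$ sbatch job.sh
$ squeue                     # show pending and running jobs</file>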

==== TORQUE Resource Manager ====

TORQUE is an open source resource manager providing control over batch jobs and distributed compute nodes. It is a community effort based on the original *PBS project and, with more than 1,200 patches, has incorporated significant advances in the areas of scalability, fault tolerance, and feature extensions contributed by NCSA, OSC, USC, the U.S. Department of Energy, and many other HPC organizations. http://www.clusterresources.com/pages/products/torque-resource-manager.php
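TORQUE keeps the familiar PBS interface: a script with #PBS directives is submitted through //qsub//. A minimal sketch (resource amounts and names are placeholders):

<file>$ cat job.sh
#!/bin/sh
#PBS -N myjob
#PBS -l nodes=2:ppn=4        # 2 nodes, 4 processors per node
#PBS -l walltime=00:10:00
cd $PBS_O_WORKDIR            # jobs start in $HOME by default
./my_program
$ qsub job.sh
$ qstat                      # check the state of the queued job</file>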
  
==== Platform LSF ====
  
[[platform_lsf|LSF]] is deployed as a resource manager on the HPC cluster alongside SGE.
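For comparison, LSF takes its directives from #BSUB lines, and //bsub// reads the script on standard input; a minimal sketch with placeholder names:

<file>$ cat job.sh
#!/bin/sh
#BSUB -J myjob               # job name
#BSUB -n 8                   # number of processors
#BSUB -o myjob.%J.out        # %J expands to the job ID
./my_program
$ bsub < job.sh
$ bjobs                      # check the state of the queued job</file>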