hpc_concepts
Differences
This shows you the differences between two versions of the page.
hpc_concepts [2009/11/05 07:54] – 172.26.0.166 | hpc_concepts [2009/11/17 11:40] – 172.26.0.166 | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | **Message Passing Interface (MPI): The Concept** | + | ==== Message Passing Interface (MPI): The Concept |
---- | ---- | ||
- | The MPI interface is meant to provide essential virtual topology, synchronization, | + | |
+ | The MPI interface is meant to provide essential virtual topology, synchronization, | ||
+ | http:// | ||
+ | |||
+ | If you are simply looking for how to run an MPI application, | ||
+ | |||
+ | < | ||
+ | |||
+ | This will run X copies of < | ||
+ | | ||
+ | | ||
+ | the use of a hostfile, or will default to running all X copies on the localhost), scheduling | ||
+ | | ||
+ | | ||
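When several copies of the same program are launched, each copy usually works on its own share of the data. The sketch below is plain Python, not MPI. It only illustrates how a copy might pick its slice from its index and the total number of copies. The names `block_range`, `rank`, `size` and `n_items` are assumptions for illustration. In a real MPI program the index and the count would come from `MPI_Comm_rank` and `MPI_Comm_size`.

```python
def block_range(rank, size, n_items):
    """Return the half-open [start, stop) range of items owned by copy `rank`.

    Items are split as evenly as possible; the first `n_items % size`
    copies each receive one extra item.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Simulate X = 4 copies sharing 10 items (both numbers are hypothetical).
size = 4
for rank in range(size):
    print(rank, block_range(rank, size, 10))
```

Every item is assigned to exactly one copy, so the copies can work independently and exchange only the results they need.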
+ | === Installation === | ||
+ | ---- | ||
+ | |||
+ | |||
+ | < | ||
+ | $ tar xfz openmpi-1.3.3.tar.gz | ||
+ | $ cd openmpi-1.3.3 | ||
+ | $ ./ | ||
+ | $ make && make install </ | ||
HPC environments are often measured in terms of FLoating point OPerations per Second (FLOPS) | HPC environments are often measured in terms of FLoating point OPerations per Second (FLOPS) | ||
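As a rough illustration of the unit, a theoretical peak figure is commonly estimated by multiplying node count, cores per node, clock rate, and floating point operations per cycle. The sketch below assumes that simple model. The function name `peak_flops` and the cluster figures are hypothetical and do not describe any particular system.

```python
def peak_flops(nodes, cores_per_node, clock_hz, flops_per_cycle):
    """Theoretical peak = nodes * cores per node * clock rate * FLOPs per cycle."""
    return nodes * cores_per_node * clock_hz * flops_per_cycle


# Hypothetical cluster: 16 nodes, 8 cores each, 2.5 GHz, 4 FLOPs per cycle per core.
peak = peak_flops(16, 8, 2.5e9, 4)
print(f"{peak / 1e12:.2f} TFLOPS")
```

Measured performance on real workloads is normally lower than this peak, because memory and network speed also limit throughput.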
- | **Condor** | + | ==== Condor |
---- | ---- | ||
+ | |||
+ | |||
Machines sit idle for long periods of time, often while their users are busy doing other things. **Condor takes this wasted computation time and puts it to good use**. The situation today matches that of yesterday, with the addition of clusters in the list of resources. These machines are often dedicated to tasks. Condor manages a cluster' | Machines sit idle for long periods of time, often while their users are busy doing other things. **Condor takes this wasted computation time and puts it to good use**. The situation today matches that of yesterday, with the addition of clusters in the list of resources. These machines are often dedicated to tasks. Condor manages a cluster' | ||
http:// | http:// | ||
+ | ==== Sun Grid Engine (SGE) ==== | ||
+ | ---- | ||
+ | |||
+ | |||
+ | SGE is typically used on a computer farm or high-performance computing (HPC) cluster and is responsible for **accepting, | ||
+ | http:// | ||
+ | |||
+ | |||
+ | ==== SLURM: A Highly Scalable Resource Manager ==== | ||
+ | |||
+ | SLURM is an open-source resource manager designed for Linux clusters of all sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work. https:// | ||
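The three functions described above (allocating nodes, starting work, and queueing work that cannot start yet) can be sketched with a toy model. This is not how SLURM is implemented. The class `ToyResourceManager`, its methods, and the strict first-in-first-out policy are assumptions chosen to keep the example short.

```python
from collections import deque


class ToyResourceManager:
    """Toy model: allocate nodes, track running jobs, queue the rest."""

    def __init__(self, total_nodes):
        self.free_nodes = total_nodes
        self.pending = deque()  # FIFO queue of (job name, nodes requested)
        self.running = {}       # job name -> nodes currently held

    def submit(self, name, nodes):
        """Queue a job, then start whatever fits."""
        self.pending.append((name, nodes))
        self._schedule()

    def finish(self, name):
        """Release a finished job's nodes, then start whatever fits."""
        self.free_nodes += self.running.pop(name)
        self._schedule()

    def _schedule(self):
        # Strict FIFO: only the job at the head of the queue may start,
        # so a large job at the head blocks smaller jobs behind it.
        while self.pending and self.pending[0][1] <= self.free_nodes:
            name, nodes = self.pending.popleft()
            self.free_nodes -= nodes
            self.running[name] = nodes


rm = ToyResourceManager(total_nodes=4)
rm.submit("a", 3)  # starts at once, one node stays free
rm.submit("b", 2)  # waits, only one node is free
rm.submit("c", 1)  # would fit, but waits behind "b"
rm.finish("a")     # frees three nodes, so "b" and then "c" start
print(sorted(rm.running), rm.free_nodes)
```

Production schedulers generally add priorities, time limits, and policies that let small jobs run ahead when doing so delays nothing. The strict queue here is only the simplest policy to reason about.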
+ | |||
+ | |||
+ | ==== TORQUE Resource Manager ==== | ||
+ | |||
+ | TORQUE is an open-source resource manager providing control over batch jobs and distributed compute nodes. It is a community effort based on the original *PBS project and, with more than 1,200 patches, has incorporated significant advances in the areas of scalability, | ||
+ | |||
+ | ==== Platform LSF ==== | ||
+ | |||
+ | [[platform_lsf|LSF]] is used as a resource manager for the HPC environment, alongside SGE. |
hpc_concepts.txt · Last modified: 2010/05/22 14:19 by 127.0.0.1