hpc_concepts
Differences
This shows you the differences between two versions of the page.
hpc_concepts [2009/11/05 07:54] – 172.26.0.166 | hpc_concepts [2009/11/05 09:19] – 172.26.0.166 | ||
---|---|---|---|
Line 3: | Line 3: | ||
---- | ---- | ||
- | The MPI interface is meant to provide essential virtual topology, synchronization, | + | |
+ | The MPI interface is meant to provide essential virtual topology, synchronization, | ||
+ | http:// | ||
HPC environments are often measured in terms of FLoating point OPerations per Second (FLOPS) | HPC environments are often measured in terms of FLoating point OPerations per Second (FLOPS) | ||
Line 10: | Line 12: | ||
---- | ---- | ||
+ | |||
+ | |||
Machines sit idle for long periods of time, often while their users are busy doing other things. **Condor takes this wasted computation time and puts it to good use**. The situation today matches that of yesterday, with the addition of clusters in the list of resources. These machines are often dedicated to tasks. Condor manages a cluster' | Machines sit idle for long periods of time, often while their users are busy doing other things. **Condor takes this wasted computation time and puts it to good use**. The situation today matches that of yesterday, with the addition of clusters in the list of resources. These machines are often dedicated to tasks. Condor manages a cluster' | ||
http:// | http:// | ||
+ | **Sun Grid Engine (SGE)** | ||
+ | ---- | ||
+ | |||
+ | |||
+ | SGE is typically used on a computer farm or high-performance computing (HPC) cluster and is responsible for **accepting, | ||
+ | http:// | ||
+ | |||
+ | |||
+ | |||
+ | ====SLURM: A Highly Scalable Resource Manager==== | ||
+ | |||
+ | SLURM is an open-source resource manager designed for Linux clusters of all sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work. |
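The three functions described for SLURM (allocating nodes, starting work on them, and queueing work that cannot start yet) can be illustrated with a toy model. This is only a sketch under stated assumptions: `Job`, `ToyResourceManager`, and the node names are hypothetical, the queue is strict first-in-first-out, and none of it reflects SLURM's actual API or scheduling policy.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    nodes_needed: int


@dataclass
class ToyResourceManager:
    """Hypothetical illustration only; not SLURM's API or policy."""
    free_nodes: set
    pending: deque = field(default_factory=deque)
    running: dict = field(default_factory=dict)  # job name -> allocated nodes

    def submit(self, job):
        # Contention is handled by queueing work that cannot start yet.
        self.pending.append(job)
        self._schedule()

    def finish(self, name):
        # Release the finished job's nodes, then let queued work start.
        self.free_nodes |= self.running.pop(name)
        self._schedule()

    def _schedule(self):
        # Assumed policy: strict FIFO, so only the head of the queue may start.
        while self.pending and self.pending[0].nodes_needed <= len(self.free_nodes):
            job = self.pending.popleft()
            # Allocate nodes exclusively to this job.
            allocated = {self.free_nodes.pop() for _ in range(job.nodes_needed)}
            # Record the job as started on its allocation.
            self.running[job.name] = allocated


rm = ToyResourceManager(free_nodes={"n1", "n2", "n3", "n4"})
rm.submit(Job("a", 3))  # starts at once on three nodes
rm.submit(Job("b", 2))  # waits: only one node is free
rm.finish("a")          # frees three nodes, so "b" starts
```

In this model a job that asks for more nodes than the cluster has would block the queue forever. A real resource manager would have to reject such a request or schedule around it.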
hpc_concepts.txt · Last modified: 2010/05/22 14:19 by 127.0.0.1