<!--#include virtual="header.txt"-->
<h1>SLURM: A Highly Scalable Resource Manager</h1>
<p>SLURM is an open-source resource manager designed for Linux clusters of all
sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive
access to resources (compute nodes) to users for some duration of time so they
can perform work. Second, it provides a framework for starting, executing, and
monitoring work (typically a parallel job) on a set of allocated nodes. Finally,
it arbitrates contention for resources by managing a queue of pending work. </p>
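<p>In practice these three functions surface through a handful of commands.
The sketch below is hypothetical usage, assuming a working SLURM installation;
the batch script name <i>myscript.sh</i> is made up for illustration:</p>

```shell
# Allocate two nodes exclusively and launch a command across them
# (resource allocation plus job launch in a single step):
srun -N2 --exclusive hostname

# Submit a batch script requesting four nodes; if the nodes are busy,
# SLURM holds the job in its queue of pending work:
sbatch -N4 myscript.sh

# Examine the queue of pending and running jobs:
squeue
```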
<p>SLURM's design is very modular with dozens of optional plugins.
In its simplest configuration, it can be installed and configured in a
couple of minutes (see <a href="http://www.linux-mag.com/id/7239/1/">
Caos NSA and Perceus: All-in-one Cluster Software Stack</a>
by Jeffrey B. Layton).
More complex configurations rely upon a
<a href="http://www.mysql.com/">MySQL</a> database for archiving
<a href="accounting.html">accounting</a> records, managing
<a href="resource_limits.html">resource limits</a> by user or bank account,
or supporting sophisticated
<a href="priority_multifactor.html">job prioritization</a> algorithms.
SLURM also provides an Applications Programming Interface (API) for
integration with external schedulers such as
<a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php">
The Maui Scheduler</a> or
<a href="http://www.clusterresources.com/pages/products/moab-cluster-suite.php">
Moab Cluster Suite</a>.</p>
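<p>As an illustration of the simplest configuration described above, a minimal
<i>slurm.conf</i> might look like the following sketch. The cluster, host, and
partition names are hypothetical; the keywords are standard slurm.conf
parameters, with the authentication and scheduler plugins selected by
<i>AuthType</i> and <i>SchedulerType</i>:</p>

```ini
# slurm.conf - minimal single-partition cluster (hypothetical host names)
ClusterName=demo
ControlMachine=head              # node running the slurmctld control daemon
AuthType=auth/munge              # authentication plugin
SchedulerType=sched/builtin      # simple FIFO scheduling plugin
NodeName=node[01-04] CPUs=4 State=UNKNOWN
PartitionName=batch Nodes=node[01-04] Default=YES State=UP
```

<p>Optional plugins, such as the MySQL-backed accounting storage mentioned
above, are enabled the same way: by naming them in this file.</p>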
<p>While other resource managers do exist, SLURM is unique in several
respects:</p>
<ul>
<li>Its source code is freely available under the
<a href="http://www.gnu.org/licenses/gpl.html">GNU General Public License</a>.</li>
<li>It is designed to operate in a heterogeneous cluster with up to 65,536 nodes
and hundreds of thousands of processors.</li>
<li>It is portable; written in C with a GNU autoconf configuration engine.
While initially written for Linux, other UNIX-like operating systems should
be easy porting targets.</li>
<li>SLURM is highly tolerant of system failures, including failure of the node
executing its control functions.</li>
<li>A plugin mechanism exists to support various interconnects, authentication
mechanisms, schedulers, etc. These plugins are documented and simple enough for the motivated end user to understand the source and add functionality.</li>
</ul>
<p>SLURM provides resource management on about 1000 computers worldwide,
including:</p>
<ul>
<li><a href="https://asc.llnl.gov/computing_resources/bluegenel/">BlueGene/L</a>
at LLNL with 106,496 dual-core processors</li>
<li><a href="http://c-r-labs.com/">EKA</a> at Computational Research Laboratories,
India with 14,240 Xeon processors and Infiniband interconnect</li>
<li><a href="https://asc.llnl.gov/computing_resources/purple/">ASC Purple</a>
an IBM SP/AIX cluster at LLNL with 12,208 Power5 processors and a Federation switch</li>
<li><a href="http://www.bsc.es/plantillaA.php?cat_id=5">MareNostrum</a>
a Linux cluster at Barcelona Supercomputer Center
with 10,240 PowerPC processors and a Myrinet switch</li>
<li><a href="http://en.wikipedia.org/wiki/Anton_(computer)">Anton</a>
a massively parallel supercomputer designed and built by
<a href="http://www.deshawresearch.com/">D. E. Shaw Research</a>
for molecular dynamics simulation using 512 custom-designed ASICs
and a three-dimensional torus interconnect</li>
</ul>
<p>SLURM is actively being developed, distributed and supported by
<a href="https://www.llnl.gov">Lawrence Livermore National Laboratory</a>,
<a href="http://www.hp.com">Hewlett-Packard</a> and
<a href="http://www.bull.com">Bull</a>.
It is also distributed and supported by
<a href="http://www.clusterresources.com">Cluster Resources</a>,
<a href="http://www.sicortex.com">SiCortex</a>,
<a href="http://www.infiscale.com">Infiscale</a>,
<a href="http://www.ibm.com">IBM</a> and
<a href="http://www.sun.com">Sun Microsystems</a>.</p>
<p style="text-align:center;">Last modified 25 March 2009</p>
<!--#include virtual="footer.txt"-->