Partition Details
| Partition | Nodes | #Nodes | Total CPUs | Max Time |
|---|---|---|---|---|
| all | cpu[01-04], gpu[01-04], mem[01-02] | 10 | 896 | 1-00:00:00 |
| cmp (default) | cpu[01-04], mem[01-02] | 6 | 576 | 4-04:00:00 |
| cpu | cpu[01-04] | 4 | 384 | 4-04:00:00 |
| gpu | gpu[01-04] | 4 | 320 | 2-00:00:00 |
| mem | mem[01-02] | 2 | 192 | 4-04:00:00 |
| amd | amd01 | 1 | 256 | 2-00:00:00 |
| hub | hub[01-09] | 9 | 72 | 1-00:00:00 |
| condo | bio[01-03], dmlab[01-02], oignat01, rgrotjahn[01-04], ycho[201-202] | 9 | 1872 | 2-00:00:00 |
| dmlab | dmlab[01-02] | 2 | 208 | 7-00:00:00 |
| bio | bio[01-03] | 3 | 384 | 7-00:00:00 |
| oignat_lab | oignat01 | 1 | 384 | 7-00:00:00 |
| rgrotjahn_lab | rgrotjahn[01-04] | 4 | 1024 | 7-00:00:00 |
| ycho2_lab | ycho[201-202] | 2 | 768 | 7-00:00:00 |
Partition Attributes
| Partition | AllowGroups | Default? | QoS | DefaultTime | PreemptMode |
|---|---|---|---|---|---|
| all | ALL | NO | | 01:00:00 | REQUEUE |
| cmp (default) | ALL | YES | | 01:00:00 | REQUEUE |
| cpu | ALL | NO | | 01:00:00 | REQUEUE |
| gpu | ALL | NO | | 01:00:00 | REQUEUE |
| mem | ALL | NO | | 01:00:00 | REQUEUE |
| amd | ALL | NO | | 01:00:00 | REQUEUE |
| hub | ALL | NO | | 01:00:00 | REQUEUE |
| condo | ALL | NO | condo_guest | 01:00:00 | REQUEUE |
| dmlab | dmlab-nodes | NO | condo_owner | 01:00:00 | REQUEUE |
| bio | memccully_lab, biol172 | NO | condo_owner | 01:00:00 | REQUEUE |
| oignat_lab | oignat_lab | NO | condo_owner | 01:00:00 | REQUEUE |
| rgrotjahn_lab | rgrotjahn_lab | NO | condo_owner | 01:00:00 | REQUEUE |
| ycho2_lab | ycho2_lab | NO | condo_owner | 01:00:00 | REQUEUE |
- Partition: the name of the queue, or logical grouping of nodes. Specify with --partition (or -p) in sbatch/srun.
- Nodes: a list or range of compute node hostnames in that partition (e.g. gpu[01-04] expands to gpu01 gpu02 gpu03 gpu04).
- #Nodes: the total count of individual nodes covered by the Nodes field.
- Max Time: the maximum wall-clock time a job may run in this partition. Slurm will cap jobs beyond this limit.
- AllowGroups: the Unix group(s) permitted to submit jobs here. ALL means no restriction; otherwise specific group(s) are listed.
- Default?: YES if this is the default partition when none is specified; NO otherwise.
- QoS: the Slurm Quality-of-Service level applied to jobs in this partition (affects priority and fairshare).
- DefaultTime: the time Slurm applies if the user omits --time (prevents runaway jobs; typically 01:00:00).
- PreemptMode: what Slurm does when higher-priority work needs these nodes. REQUEUE returns preempted jobs to the queue.
There are several queues, or partitions of nodes, that can be used for submitting jobs.
- The cmp partition is the default partition if none is specified and contains all the cpu and mem nodes (i.e. all nodes without a GPU). This partition should be used for all compute-intensive tasks that do not require a GPU.
- The gpu partition contains all GPU-capable nodes (384 GB RAM, 2x NVIDIA V100 GPUs) and should be used if you require a GPU. Note that you will also need to explicitly request a GPU, as explained in Requesting GPU Resources.
- The mem partition contains all high-memory nodes (2 TB RAM, no GPU) and should be used if you require the extra memory.
- The cpu partition contains a set of compute nodes (512 GB RAM, no GPU).
- The amd partition contains a single node (256 GB RAM, 8x AMD MI100 GPUs) whose GPUs use AMD's CDNA architecture, as opposed to NVIDIA's CUDA.
- The all partition comprises all the nodes in the cpu, gpu, and mem partitions. It exists to let users request parallelism across the breadth of the cluster. Note that the all partition has a lower time limit so that all the nodes cannot be scheduled for an extended period of time.
- The condo partition is composed of a heterogeneous set of nodes purchased by individual faculty members. You are welcome to run jobs on this partition when the nodes are available, but be aware that your job may be preempted and requeued if someone from the faculty member's lab requests the resources.
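As a sketch, a minimal batch script that targets the gpu partition might look like the following (the GPU count, time, task counts, and executable name are illustrative placeholders, not site-mandated values):

```shell
#!/bin/bash
#SBATCH --partition=gpu        # run on the gpu partition
#SBATCH --gres=gpu:1           # explicitly request one GPU (see Requesting GPU Resources)
#SBATCH --time=02:00:00        # well under the gpu partition's 2-00:00:00 limit
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4

srun ./my_gpu_program          # placeholder executable
```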
It is important to specify which partitions you need and your necessary resources as it will provide you with the best scheduling availability for your task.
The Slurm sinfo command provides information about the nodes and partitions. For example, running sinfo on an idle cluster might show the following:
$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
all up 1-00:00:00 10 idle cpu[01-04],gpu[01-04],mem[01-02]
amd up 2-00:00:00 1 idle amd01
cmp* up 4-04:00:00 6 idle cpu[01-04],mem[01-02]
cpu up 4-04:00:00 4 idle cpu[01-04]
gpu up 2-00:00:00 4 idle gpu[01-04]
mem up 4-04:00:00 2 idle mem[01-02]
condo up 2-00:00:00 9 idle bio[01-03],dmlab01,oignat01,rgrotjahn[01-04]
Note that each queue or partition has different limits on both the number of nodes and the maximum time for which they can be reserved. See the tables above for more information.
A user can specify the partition using the --partition (or -p) parameter in either the srun or sbatch commands.
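For example (myjob.sh is a placeholder script name):

```shell
# interactive shell on a high-memory node for one hour
srun --partition=mem --time=01:00:00 --pty bash

# batch job on the gpu partition; -p is the short form of --partition
sbatch -p gpu myjob.sh
```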
Use scontrol to see the details on individual nodes, which is especially useful for the heterogeneous condo queue. The CfgTRES line will tell you the number of CPUs and GPUs and amount of RAM available on the whole node. For example, gpu01 has 80 CPUs, 368 GB of RAM, and 2 GPUs.
$ scontrol show node gpu01
NodeName=gpu01 Arch=x86_64 CoresPerSocket=20
CPUAlloc=32 CPUEfctv=80 CPUTot=80 CPULoad=2.47
AvailableFeatures=volta,v100
ActiveFeatures=volta,v100
Gres=gpu:volta:2
NodeAddr=gpu01 NodeHostName=gpu01 Version=24.05.2
OS=Linux 4.18.0-553.16.1.el8_10.x86_64 #1 SMP Thu Aug 8 17:47:08 UTC 2024
RealMemory=376800 AllocMem=131072 FreeMem=284879 Sockets=2 Boards=1
State=MIXED ThreadsPerCore=2 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
Partitions=all,gpu
BootTime=2025-06-24T10:17:48 SlurmdStartTime=2025-06-24T10:38:07
LastBusyTime=2025-08-20T00:33:09 ResumeAfterTime=None
CfgTRES=cpu=80,mem=376800M,billing=80,gres/gpu=2
AllocTRES=cpu=32,mem=128G,gres/gpu=2
CurrentWatts=0 AveWatts=0
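When you only need a quick summary rather than the full node record, sinfo's format string can condense this; a sketch using standard format specifiers (%P partition, %l time limit, %D node count, %N node list):

```shell
# one line per partition: name, time limit, node count, node list
sinfo -o "%P %l %D %N"

# just the configured and allocated resources for one node
scontrol show node gpu01 | grep -E 'CfgTRES|AllocTRES'
```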
The --time parameter specifies a job's requested wall-clock time, in the format d-hh:mm:ss. For example, the following will schedule a job for two days of processing:

srun ... --time 2-00:00:00

Whereas the following will schedule a job for 30 minutes of processing:

srun ... --time 00:30:00
There is a maximum time for which a given partition of nodes can be reserved. This varies based on the partition. The sinfo command will display the maximum possible time allowed per partition. If you try to schedule a job for more than the maximum time allowed, the scheduler will override your request.
| Partition | Max Wall-Time (d-hh:mm:ss) |
|---|---|
| all | 1-00:00:00 |
| cmp | 4-04:00:00 |
| cpu | 4-04:00:00 |
| gpu | 2-00:00:00 |
| mem | 4-04:00:00 |
| amd | 2-00:00:00 |
| hub | 1-00:00:00 |
| condo | 2-00:00:00 |
| dmlab | 7-00:00:00 |
| bio | 7-00:00:00 |
| oignat_lab | 7-00:00:00 |
| rgrotjahn_lab | 7-00:00:00 |
| ycho2_lab | 7-00:00:00 |
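To see how the d-hh:mm:ss values above relate to one another, here is a small helper (a hypothetical illustration, not a Slurm tool) that converts either time form to seconds:

```shell
#!/bin/bash
# Convert a Slurm time string (d-hh:mm:ss or hh:mm:ss) to seconds.
slurm_time_to_seconds() {
  local t=$1 days=0
  if [[ $t == *-* ]]; then     # strip a leading "d-" component, if present
    days=${t%%-*}
    t=${t#*-}
  fi
  local h m s
  IFS=: read -r h m s <<< "$t"
  # 10# forces base 10 so leading zeros aren't parsed as octal
  echo $(( days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

slurm_time_to_seconds 4-04:00:00   # 360000 (the cmp partition's maximum)
slurm_time_to_seconds 00:30:00     # 1800
```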