
SLURM Jobs

Scheduling Resources with Slurm

The HPC system is a shared community resource used by many users, including students, faculty, and staff. As such, it uses a resource scheduling program called Slurm to ensure fair access to computing resources among all users. Slurm allows users to request resources for both interactive and batch-mode computing: either an interactive session on a compute node, or a series of commands that run automatically once the requested resources become available and are allocated.
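
For example, an interactive session on a compute node can be requested with srun, while a script can be submitted for batch execution with sbatch (a batch example appears later on this page). The resource amounts and time limit in this sketch are placeholder values; adjust them to your workload and your site's limits.

    # Request an interactive shell on one compute node for one hour (example values)
    srun --nodes=1 --ntasks=1 --cpus-per-task=1 --mem=4G --time=01:00:00 --pty bash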

Aside: Slurm, short for "Simple Linux Utility for Resource Management", is an open-source job scheduler developed for High Performance Computing (HPC) systems. Slurm is the workload manager for approximately 60% of the top 500 supercomputers. More information about Slurm is available in the Additional Slurm Documentation section at the end of this page.

Login Node versus Compute Nodes

When you access the HPC, you typically first log in to a front-end login node. From this front-end server, you can configure your environment, write code, compile programs, and test software on small data sets. However, the login server is a shared community resource and should not be used for large computational tasks. Instead, the login node serves as your gateway to the large compute resources available on the back-end compute nodes.


To access the back-end compute nodes, a user uses Slurm either to request interactive resources or to schedule jobs that are executed in batch mode when the requested resources become available. For large computational tasks, batch processing is preferred.
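
As a minimal sketch, a batch job is described in a submission script containing #SBATCH directives followed by the commands to run. The job name, resource requests, and program below are placeholders and should be adapted to your workload.

    #!/bin/bash
    #SBATCH --job-name=myjob          # name shown in the queue
    #SBATCH --nodes=1                 # number of nodes
    #SBATCH --ntasks=1                # number of tasks (processes)
    #SBATCH --cpus-per-task=4         # CPU cores per task
    #SBATCH --mem=8G                  # memory per node
    #SBATCH --time=02:00:00           # wall-clock limit (hh:mm:ss)

    # Commands executed on the compute node once the job starts (placeholder program)
    ./my_program

The script is submitted with sbatch, and its status can be checked with squeue:

    sbatch myjob.sh
    squeue -u $USER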

Alternatively, a user can access the HPC through our JupyterHub Interface. JupyterHub will allocate a session for the user on a compute node. By default, that session will be in the “hub” partition, which consists of nodes tailored to the needs of a typical JupyterHub user. JupyterHub also provides limited access to other computing resources in the back-end compute nodes.
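
To see which partitions are available and the state of their nodes, the sinfo command can be run from the login node; the hub partition mentioned above is used here as an example and may differ on other systems.

    # List all partitions and their nodes
    sinfo

    # Show only the hub partition
    sinfo -p hub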