Submitting workloads

The High Performance Computing (HPC) resources at UT Dallas use the Slurm Workload Manager to provide access to the compute nodes from the login nodes. When multiple users or jobs need the same resources, Slurm queues the pending work and dispatches it as resources become available. You can request resources from Slurm in two ways: interactively and through a batch script.

Interactive jobs

When you run a job interactively with srun --pty /bin/bash or salloc, you're given terminal access to the compute node once the requested resources are allocated. Before requesting an interactive session, you need to know:

  1. The partition that contains the resources you need (default: normal on Ganymede; denoted -p or --partition). To list the available partitions, see the sinfo example after this list.

  2. The number of nodes you need. (-N or --nodes)

  3. Either the total number of tasks needed, or the number of tasks per node. (-n or --ntasks for total tasks and --ntasks-per-node for the number of tasks per node)

  4. (Optional) the number of CPUs needed per task. (--cpus-per-task)
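
If you're not sure which partitions exist or what resources they offer, sinfo reports them along with their time limits and node states (normal is the default partition on Ganymede, per above):

sinfo                # List all partitions, their time limits, and node states
sinfo -p normal      # Show only the normal partition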

Interactive job examples

For example, the following command requests one node and one task on the normal partition, a setup suitable for a serial job:

srun -N 1 -n 1 -p normal --pty /bin/bash
or
salloc -N 1 -n 1 -p normal --time=00:30:00

Similarly, the following command requests two nodes and thirty-two tasks (equivalently, sixteen tasks per node) on the normal partition, which would be appropriate for an MPI-parallelized job:

srun -N 2 -n 32 -p normal --pty /bin/bash
or
salloc -N 2 -n 32 -p normal --time=00:30:00

Finally, the following command requests one node, one task, and 16 CPUs per task on the normal partition, suitable for an OpenMP-parallelized job:

srun -N 1 -n 1 -p normal --cpus-per-task 16 --pty /bin/bash
or
salloc -N 1 -n 1 -p normal --cpus-per-task 16 --time=00:30:00

Launching work in an interactive job

Once your resources are allocated in an interactive job, you have terminal access to the compute nodes requested. From there, you can run your workload application or scripts in a manner suitable to your parallelization method.
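
For example, a session inside an interactive allocation might look like the following sketch. The module name and program names are placeholders for your own software:

module load gcc                                # Load your application's environment (module name is an example)
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK    # For OpenMP, match the thread count to the allocated CPUs
./my_openmp_app                                # Run a serial or OpenMP program directly
mpirun ./my_mpi_app                            # For a multi-task allocation, launch an MPI program with your MPI launcher
exit                                           # Leaving the shell ends the job and releases the resources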

Slurm batch scripts

Often, it’s preferable to submit your job to the compute nodes non-interactively. By using the batch script method, a job is queued until resources become available and then run with no further input from you. Submitting to the batch system requires writing a batch script composed of Slurm settings and your workload commands.

Slurm specifications in a batch script

At the beginning of your batch script, you can specify Slurm batch settings by prefixing each one with #SBATCH. At minimum, you need to specify:

  1. The partition to request resources from (default: normal on Ganymede). For example: #SBATCH --partition=normal

  2. The number of nodes your workload requires. Example: #SBATCH --nodes=2

  3. The total number of tasks required. Example: #SBATCH --ntasks=32. Alternatively, you can specify the number of tasks per node with #SBATCH --ntasks-per-node=16

  4. The maximum time required for your workload in the format Days-Hours:Minutes:Seconds. For example: #SBATCH --time=1-12:00:00 sets a maximum run time of one day and twelve hours.

For a full list of available settings, see the sbatch documentation.
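
Beyond the minimum, a few other settings are commonly useful. The values below are illustrative, not required:

#SBATCH --job-name=my_job         # Name displayed in squeue output
#SBATCH --output=slurm-%j.out     # File for stdout/stderr; %j expands to the job ID
#SBATCH --mem=32G                 # Memory required per node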

Slurm batch script example

For example, the following batch script runs a Python script parallelized with mpi4py on two nodes with thirty-two total tasks and a maximum runtime of one hour:

#!/bin/bash

#SBATCH     --partition=normal
#SBATCH     --nodes=2
#SBATCH     --ntasks=32
#SBATCH     --time=01:00:00
#SBATCH     --mail-type=ALL
#SBATCH     --mail-user=your.email@utdallas.edu

prun python my_script.py
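
To submit the job, save the script to a file (the name here is an example) and pass it to sbatch. You can then track its progress with squeue:

sbatch my_job.sh     # Prints "Submitted batch job <jobid>" on success
squeue -u $USER      # Show your jobs (ST column: R = running, PD = pending)
scancel <jobid>      # Cancel the job if needed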

For more information on prun and parallelization techniques, see the parallelization documentation.

FAQ

The following are some frequently asked questions regarding the use of Slurm.

My job shows Pending (PD) even though free nodes are available

This can happen for various reasons. A common one is described below.

Problem: The time limit for your job conflicts with a reservation (such as a scheduled maintenance window).

Description: The REASON column in squeue output shows ReqNodeNotAvail, Reserved for maintenance.

Fix: Either shorten your job's requested time limit so it completes before the reservation begins, or wait until after the reservation window. To see existing reservations, run scontrol show reservations.
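
To confirm the cause and adjust a pending job from the command line, you can use squeue and scontrol; the job ID and new time limit below are placeholders:

squeue -u $USER                                    # The rightmost column shows the pending REASON
scontrol update JobId=<jobid> TimeLimit=02:00:00   # Shorten the limit so the job fits before the reservation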