HPC Vocabulary

Using High Performance Computing (HPC) resources involves learning the language commonly used to describe experiment setups, resources, cluster configuration, etc. Here are some commonly used terms.

Experiment Terms


Job

A single workload submitted to the scheduling system. For example, running a script or program on a node.

Parallelization Terms


MPI

MPI stands for Message Passing Interface, a standard that describes how information should be passed between nodes and tasks. MPI is a specification, not an implementation; popular implementations include Open MPI, MPICH, and MVAPICH.
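As a sketch of what message passing looks like in practice, here is a minimal point-to-point example using mpi4py, one Python binding to MPI (the dictionary payload and script name are illustrative; this requires an MPI implementation plus mpi4py and must be started with an MPI launcher):

```python
# Run under an MPI launcher, e.g.: mpiexec -n 2 python mpi_hello.py
from mpi4py import MPI

comm = MPI.COMM_WORLD   # communicator spanning all launched tasks
rank = comm.Get_rank()  # this task's id, 0 .. size-1
size = comm.Get_size()  # total number of tasks

if size > 1:
    if rank == 0:
        # Task 0 sends a Python object to task 1.
        comm.send({"greeting": "hello from rank 0"}, dest=1, tag=0)
    elif rank == 1:
        # Task 1 blocks until the matching message arrives.
        msg = comm.recv(source=0, tag=0)
        print(f"rank 1 received: {msg}")
else:
    print("launched as a single task; use mpiexec -n 2 to see message passing")
```

Each task runs as its own process (often on a different node); the only way they share data is by explicitly sending and receiving messages, which is what distinguishes MPI from shared-memory parallelism.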

Resource Terms


Cluster

A group of computers (called nodes) linked together by an internal network (called an interconnect).


Condo

A purchasing model in which a researcher buys their own compute hardware, subject to approval from the cluster operator, and the cluster operator installs and manages it. Condo nodes are "exclusive access": they are generally available only to the purchasing group.


Interconnect

The internal network that allows cluster nodes to communicate with one another.


Node

One computer composing an HPC cluster. Jobs run on HPC clusters can often use more than one node.

Job Scheduler

Manages access to the computing resources on the cluster. UT Dallas largely uses Slurm.
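To make the scheduler's role concrete, here is a minimal Slurm batch script sketch (the job name, partition name, and resource amounts are placeholders; check your cluster's documentation for real values):

```shell
#!/bin/bash
#SBATCH --job-name=hello        # name shown in the queue
#SBATCH --partition=normal      # placeholder partition name
#SBATCH --nodes=1               # number of nodes to allocate
#SBATCH --ntasks=1              # number of tasks (processes)
#SBATCH --time=00:10:00         # wall-clock time limit (HH:MM:SS)

# The commands below run on the allocated compute node.
echo "Running on $(hostname)"
```

The script is submitted with `sbatch`, after which `squeue -u $USER` shows the job waiting for resources or running.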


Partition

A group of nodes with imposed constraints (e.g., allowed users).


CPU

Sometimes referred to as a core. Cores can execute instructions independently, which enables parallel programs.


GPU

Graphics Processing Unit. Dedicated hardware for highly parallel processing.


Queue

The list of jobs currently running or waiting for resources in a particular partition.