Slurm Job Status Reference Sheet

Below is a reference sheet for Slurm job statuses, their meanings, and common causes.

| Code | Status | Description | Common Cause or Example |
|------|--------|-------------|-------------------------|
| PD | Pending | The job is awaiting resource allocation and has not yet started execution. | Waiting for resources, partition limits, or job dependencies. |
| R | Running | The job currently has an allocation and is executing on the assigned nodes. | Job is actively running code or computations. |
| CD | Completed | The job terminated all processes on all allocated nodes with an exit code of zero. | Normal completion: program exited cleanly with code 0. |
| F | Failed | The job terminated with a non-zero exit code or encountered another failure condition during execution. | Script error, segmentation fault, or crash. |
| CG | Completing | The job is in the process of completing, but some processes on some nodes may still be active. | Cleanup in progress after job termination. |
| CA | Cancelled | The job was explicitly cancelled by the user or a system administrator. It may or may not have started running. | User ran scancel <jobid>, or a dependency could never be satisfied and the job was set to be killed in that case (--kill-on-invalid-dep=yes). |
| CF | Configuring | The job has been allocated resources, but is waiting for them to become ready for use (e.g., booting up). | Nodes are powering on or network initialization is pending. |
| S | Suspended | The job has an allocation, but its execution has been temporarily suspended and its cores have been released for other jobs. | Admin intervention (e.g., scontrol suspend <jobid>), system load balancing, or checkpoint/restart testing. |
| ST | Stopped | The job has been stopped with SIGSTOP, but its cores are retained, unlike a suspended job, which releases its cores. | Manually paused for debugging or maintenance. |
| TO | Timeout | The job reached its allocated time limit and was terminated. | Walltime (#SBATCH -t) exceeded before the job finished. |
| OOM | Out of Memory | The job terminated due to an out-of-memory error. | A process exceeded the job's memory limit or the node's memory; increase --mem or optimize code. |
| NF | Node Fail | The job terminated due to the failure of one or more allocated nodes. | Hardware failure, network outage, or node reboot. |
| PR | Preempted | The job was terminated due to preemption, typically to make room for a job with higher priority. | Higher-priority user or system reservation took precedence. |
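
The codes in this sheet can also be looked up programmatically. The following is a minimal Python sketch, not part of Slurm itself: the STATES table, the describe and summarize helpers, and the "finished" flag are all hypothetical names introduced here for illustration. In particular, which statuses count as finished is an assumption of this sketch (for example, a preempted or node-failed job may be requeued on some clusters).

```python
from collections import Counter

# Compact code -> (status name, finished?). Names follow the sheet above;
# the "finished" flag is this sketch's own assumption, not a Slurm field.
STATES = {
    "PD": ("Pending", False),
    "R": ("Running", False),
    "CD": ("Completed", True),
    "F": ("Failed", True),
    "CG": ("Completing", False),
    "CA": ("Cancelled", True),
    "CF": ("Configuring", False),
    "S": ("Suspended", False),
    "ST": ("Stopped", False),
    "TO": ("Timeout", True),
    "OOM": ("Out of Memory", True),
    "NF": ("Node Fail", True),
    "PR": ("Preempted", True),
}


def describe(code):
    """Return (status name, finished?) for a compact code, case-insensitively."""
    return STATES.get(code.strip().upper(), ("Unknown", False))


def summarize(text):
    """Count jobs per status from text holding one compact code per line."""
    counts = Counter()
    for line in text.splitlines():
        code = line.strip()
        if code:  # skip blank lines
            counts[describe(code)[0]] += 1
    return counts


if __name__ == "__main__":
    # Sample input standing in for real scheduler output.
    sample = "R\nPD\nPD\nCG\n"
    for name, count in summarize(sample).most_common():
        print(f"{name}: {count}")
```

On a cluster, input of this shape could come from squeue -h -u $USER -o %t, which prints one compact state code per job with no header. Note that squeue lists finished jobs only briefly, so codes such as TO or OOM are more often seen through accounting tools like sacct, which report the full state names rather than the compact codes.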