General Principles

Jobs that run on multiple nodes generally use a parallel programming API called MPI (Message Passing Interface), which allows processes on multiple nodes to communicate with high throughput and low latency (especially over Talapas' InfiniBand network).  MPI is a standard with multiple implementations; several are available on Talapas, notably Intel MPI and Open MPI.  The choice between them is largely a matter of personal taste and the specific needs of the situation.

The SLURM scheduler has built-in support for MPI jobs.  Jobs can be run in a generic way, or if needed, you can use extra parameters to carefully control how MPI processes are mapped to the hardware.
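
For example, process placement can be controlled with srun's binding and distribution options.  The line below is only an illustrative sketch; the executable name is a placeholder.

Code Block
languagebash
# illustrative: bind each MPI task to a core and fill nodes block-by-block
srun --cpu-bind=cores --distribution=block:block ./my_mpi_program.x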

...

Most parts of job setup are the same for all MPI flavors.  Notably, you'll want to decide how many simultaneous tasks (processes) your job will run across.
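
As a sketch (the numbers here are purely illustrative), the task count is usually expressed with sbatch directives like these:

Code Block
languagebash
# 2 nodes x 28 tasks per node = 56 MPI tasks in total (illustrative values)
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=28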

...

Code Block
languagetext
#SBATCH --mem-per-cpu=8G

Strictly speaking, this is needed only if the job will require more than the default amount of RAM, but it's always a good idea to specify it.

...

  1. The preferred way is to invoke it directly with the srun command within your sbatch script.  This provides a few additional features and is arguably a bit simpler.

  2. The alternative is to invoke it using the mpirun/mpiexec program within your sbatch script (both styles are sketched below).

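For example (mympi_program.x here is a placeholder for your own MPI executable), the launch line inside the sbatch script is one of the following:

Code Block
languagebash
# option 1: let SLURM start the MPI tasks
srun ./mympi_program.x
# option 2: let the MPI implementation's own launcher start the tasks
mpirun ./mympi_program.x
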
...

Code Block
languagebash
#!/bin/bash
#SBATCH --account=racs
#SBATCH --partition=compute
#SBATCH --job-name=intel-mpi
#SBATCH --output=intel-mpi.out
#SBATCH --error=intel-mpi.err
# 3 nodes x 28 tasks per node = 84 MPI processes, at most one task per core
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=28
#SBATCH --ntasks-per-core=1
module load intel-oneapi-mpi/2021.9.0
# srun launches one copy of the executable per task across the allocation
srun ./helloworld_mpi.x

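Submit the script in the usual way, for example with sbatch intel-mpi.sh (the file name here is a placeholder); standard output and errors will be written to intel-mpi.out and intel-mpi.err, as specified by the directives above.
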
Open MPI

To access Open MPI,

Code Block
languagebash
module load openmpi/4.1.1
# or
module load openmpi/4.1.5

Starting an Open MPI job directly with srun is not supported.  Doing so might not produce an obvious error, but in some cases it will simultaneously start many independent single-process copies of the program instead of a single MPI job with all processes working together.  At best this will be very slow, and at worst the output may be incorrect.  For more information, see man mpirun.
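
Instead, launch Open MPI jobs with mpirun.  The script below is a minimal sketch modeled on the Intel MPI example above; the account, partition, and executable name are placeholders to adjust for your own job.

Code Block
languagebash
#!/bin/bash
#SBATCH --account=racs
#SBATCH --partition=compute
#SBATCH --job-name=open-mpi
#SBATCH --output=open-mpi.out
#SBATCH --error=open-mpi.err
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=28
#SBATCH --ntasks-per-core=1
module load openmpi/4.1.5
# mpirun picks up the node list and task count from the SLURM allocation
mpirun ./helloworld_mpi.x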

...