Table of contents:
- How do I run jobs under SLURM?
- Does Open MPI support "srun -n X my_mpi_application"?
- I use SLURM on a cluster with the OpenFabrics network stack. Do I need to do anything special?
| 1. How do I run jobs under SLURM? |
The short answer is you can use mpirun as normal, or directly launch
your application using srun.
The longer answer is that Open MPI supports launching parallel jobs in
all three methods that SLURM supports:
- Launching via "salloc ...": supported (older versions of SLURM used "srun -A ...")
- Launching via "sbatch ...": supported (older versions of SLURM used "srun -B ...")
- Launching via "srun -n X my_mpi_application"
Specifically, you can launch Open MPI's mpirun in an interactive
SLURM allocation (via the salloc command) or you can submit a
script to SLURM (via the sbatch command), or you can "directly"
launch MPI executables via srun.
Open MPI automatically obtains both the list of hosts and how many
processes to start on each host from SLURM directly. Hence, it is
unnecessary to specify the --hostfile, --host, or -np options to
mpirun. Open MPI will also use SLURM-native mechanisms to launch
and kill processes (rsh and/or ssh are not required).
For example:
# Allocate a SLURM job with 4 nodes
shell$ salloc -N 4 sh
# Now run an Open MPI job on all the nodes allocated by SLURM
# (Note that you need to specify -np for the 1.0 and 1.1 series;
# the -np value is inferred directly from SLURM starting with the
# v1.2 series)
shell$ mpirun my_mpi_application
This will run 4 MPI processes on the nodes that were allocated by
SLURM. Equivalently, you can do this:
# Allocate a SLURM job with 4 nodes and run your MPI application in it
shell$ salloc -N 4 mpirun my_mpi_application
Or, if submitting a script:
shell$ cat my_script.sh
#!/bin/sh
mpirun my_mpi_application
shell$ sbatch -N 4 my_script.sh
srun: jobid 1234 submitted
shell$
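As a hedged illustration of the batch workflow above, the script below is a
minimal sketch, not an official Open MPI recipe. It moves the node request
into #SBATCH directives and prints a few variables that SLURM exports to a
batch job (SLURM_JOB_ID, SLURM_JOB_NUM_NODES, SLURM_NTASKS,
SLURM_JOB_NODELIST) before calling mpirun. The describe_allocation helper,
the job name, and the guard on SLURM_JOB_ID are illustrative assumptions;
my_mpi_application is the placeholder used throughout this entry. The
variables are printed only for inspection; this sketch makes no claim about
which of them Open MPI reads internally.

```shell
#!/bin/sh
# Request 4 nodes, the same as passing "-N 4" to sbatch.
#SBATCH --nodes=4
# Hypothetical job name, chosen only for this example.
#SBATCH --job-name=ompi_demo

# Print allocation details taken from the environment SLURM gives the job.
# Any variable that is not set is reported as "unset".
describe_allocation() {
    echo "nodes=${SLURM_JOB_NUM_NODES:-unset}" \
         "tasks=${SLURM_NTASKS:-unset}" \
         "nodelist=${SLURM_JOB_NODELIST:-unset}"
}

# Launch only when running inside a SLURM job (assumed guard, so that
# running the script by hand outside SLURM does nothing harmful).
if [ -n "${SLURM_JOB_ID:-}" ]; then
    describe_allocation
    # No -np, --host, or --hostfile: mpirun takes these from SLURM.
    mpirun my_mpi_application
else
    echo "not inside a SLURM allocation; nothing launched" >&2
fi
```

Because the node count is set inside the script, it can be submitted with
"sbatch my_script.sh" without -N on the command line.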
| 2. Does Open MPI support "srun -n X my_mpi_application"? |
Yes.
| 3. I use SLURM on a cluster with the OpenFabrics network stack. Do I need to do anything special? |
Yes. You need to ensure that SLURM sets up the locked memory
limits properly. Be sure to see this FAQ entry about
locked memory and this FAQ entry for
references about SLURM.
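One way to check the limit that MPI processes will actually inherit is to
query it from inside a SLURM job. The following is a minimal sketch, assuming
that an "unlimited" locked-memory limit is the desired setting, as is commonly
recommended for OpenFabrics networks. The memlock_status helper is a
hypothetical name introduced for this example, and the locked-memory FAQ entry
remains the authoritative reference for what your site needs.

```shell
#!/bin/sh
# Classify a locked-memory limit as reported by "ulimit -l".
# "unlimited" is treated as the desired value (an assumption of this sketch);
# anything else is echoed back so the actual limit is visible.
memlock_status() {
    case "$1" in
        unlimited) echo "ok" ;;
        *)         echo "limited: $1" ;;
    esac
}

# Report the limit in effect for the current shell.
memlock_status "$(ulimit -l)"
```

Running the script through SLURM rather than in a login shell (for example,
"srun -N 4 sh check_memlock.sh", where check_memlock.sh is a hypothetical
file name for the script above) shows the limit on each allocated node, which
may differ from the limit in an interactive login session.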