If you are new to cluster computing, you may be wondering why you can't simply run your applications the same way you do on your laptop or desktop. First, Talapas is a shared resource, so we need a way to coordinate the work of many researchers simultaneously so that users aren't stepping on each other's toes.
Second, when you first log into Talapas you will be connected to one of our "login nodes". These nodes are essentially a lobby in which users can do file management, write scripts, and submit jobs. They are not an appropriate place to run applications or conduct simulations. A good rule of thumb is: If it takes more than one second to complete, it's not appropriate for a login node.
Instead, these tasks should be conducted on a "compute node." These are purpose-built for running intensive computations and can only be accessed via the SLURM job scheduler. SLURM will ensure that the compute nodes are allocated in a fair and equitable manner that prevents resource conflicts. The primary method by which you will run simulations on Talapas will be to "submit a job."
Step-by-step guide
Create a job script. A job script is a description of the computational resources your job requires and the executables you wish to run. Let's look at a "hello world" example of a job script:
hello.srun
#!/bin/bash
#SBATCH --partition=long        ### Partition (like a queue in PBS)
#SBATCH --job-name=HiWorld      ### Job Name
#SBATCH --output=Hi.out         ### File in which to store job output
#SBATCH --error=Hi.err          ### File in which to store job error messages
#SBATCH --time=0-00:01:00       ### Wall clock time limit in Days-HH:MM:SS
#SBATCH --nodes=1               ### Number of nodes needed for the job
#SBATCH --ntasks-per-node=1     ### Number of tasks to be launched per Node
#SBATCH --account=<myPIRG>      ### Account used for job submission

./a.out                         # run your actual program
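The final line runs ./a.out, which is assumed to be an executable you have already built in the submission directory. If you just want something to test with, here is a minimal sketch using a hypothetical hello.c source file (assuming a C compiler is available, possibly after loading a compiler module):

gcc hello.c      # gcc names the resulting executable ./a.out by default
ls a.out         # confirm the executable exists before submitting the job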
Above we see the contents of our SLURM script (aka job script) called hello.srun. (The name is arbitrary; use whatever name you like.) Notice that the script begins with #!/bin/bash. This line tells Linux which shell interpreter to use when executing the script. Here we used bash (the Bourne Again Shell), and it's by far the most common choice, but other interpreters could be used (e.g., tcsh, python, etc.). Whatever your choice, every script should begin with an interpreter directive.
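For instance, here is a minimal sketch of the same job written with a Python interpreter directive instead of bash (assuming python3 is available on the compute nodes; the file names here are arbitrary). The #SBATCH lines are still honored because the scheduler reads them as comments appearing before the first line of code:

hello_py.srun
#!/usr/bin/env python3
#SBATCH --partition=long
#SBATCH --job-name=HiWorldPy
#SBATCH --output=Hi_py.out
#SBATCH --error=Hi_py.err
#SBATCH --time=0-00:01:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --account=<myPIRG>

print("Hello, world from Python!")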
Next, we see a collection of specially formatted comments, each beginning with #SBATCH followed by option definitions. These are used by the sbatch command to set job options. (As comments, they are ignored by bash.) This allows us to describe our job to the scheduler and ensure that we reserve the appropriate resources (cores, memory, etc.) for an appropriate amount of time.
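Beyond the directives shown in hello.srun, a few other commonly requested options are sketched below. The values are placeholders rather than recommendations; partition limits and defaults on Talapas may differ, so consult the sbatch man page and the cluster documentation:

#SBATCH --cpus-per-task=4                 ### CPU cores per task (for multithreaded programs)
#SBATCH --mem=8G                          ### Memory required per node
#SBATCH --mail-type=END,FAIL              ### Events that trigger email notification
#SBATCH --mail-user=duckID@uoregon.edu    ### Address to notify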
While the specified --time needs to be long enough for the job to complete (lest it be killed when time runs out), it's also good not to needlessly overestimate the amount of time required. Shorter jobs are more likely to run sooner, as they can fill in between longer jobs that aren't yet runnable.
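Most options can also be given on the sbatch command line, where they override the corresponding #SBATCH directives in the script. For example, to request a longer wall clock limit for one run without editing hello.srun (the 30-minute value here is arbitrary):

[duckID@ln1 helloworld]$ sbatch --time=0-00:30:00 hello.srun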
Note that the script suffix we used is unimportant. You can name your job scripts whatever you wish.
Submit your job to the scheduler using the sbatch command.

[duckID@ln1 helloworld]$ sbatch hello.srun
Submitted batch job 20190
[duckID@ln1 helloworld]$
Our job has been submitted and is assigned the job number 20190, which will serve as its primary identifier.
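If you submit jobs from a shell script and want to capture the job number programmatically, sbatch's --parsable option prints only the job ID (followed by the cluster name, if one is configured); a minimal sketch:

jobid=$(sbatch --parsable hello.srun)   # captures e.g. 20190
echo "Submitted job $jobid"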
Check on your job using the squeue command.

[duckID@ln1 helloworld]$ squeue
       JOBID PARTITION     NAME     USER ST        TIME NODES NODELIST(REASON)
       20190      long  HiWorld   duckID CG        1:09     1 n074
20123_[1-35]      long RSA_09_c    user1 PD        0:00     1 (ReqNodeNotAvail, UnavailableNodes:hpc-hn2,ln[1-2],n[005,120,122])
20017_[5-20]   longgpu pressure    user2 PD        0:00     1 (ReqNodeNotAvail, UnavailableNodes:hpc-hn2,ln[1-2],n[005,120,122])
     20017_4   longgpu pressure    user2  R  1-03:53:46     1 n110
     20017_3   longgpu pressure    user2  R  1-06:34:31     1 n109
       19468   longfat     bash    user3  R 11-21:16:00     1 n123
     20017_2   longgpu pressure    user2  R  1-21:25:54     1 n119
    19995_20   longgpu pressure    user2  R  3-10:28:49     1 n106
    19995_19   longgpu pressure    user2  R  3-19:37:26     1 n104
     19995_3   longgpu pressure    user2  R  4-05:01:45     1 n111
     19995_4   longgpu pressure    user2  R  4-05:01:45     1 n112
     19995_5   longgpu pressure    user2  R  4-05:01:45     1 n113
    19995_11   longgpu pressure    user2  R  4-05:01:45     1 n100
     20017_0   longgpu pressure    user2  R  2-03:41:35     1 n107
     20017_1   longgpu pressure    user2  R  2-03:41:35     1 n118
       20189       fat build-R-    user4  R        7:42     1 n121
       20188       fat build-R-    user4  R       13:00     1 n124
       20177      defq make-sil    user4  R     1:43:44    72 n[006-073,075-078]
[duckID@ln1 helloworld]$
Here we see that our job, number 20190, is in the CG (completing) state. Notice that other jobs in the system are in the R (running) or PD (pending) states. Jobs are pending when there are insufficient resources available to accommodate the request specified in the job script. In this case, the system was scheduled for maintenance, and the wall clock limit specified in those jobs would have allowed them to run into the maintenance period. The jobs will run once the maintenance is complete and the reservation is removed from the system. To view only your jobs, use the option flag -u followed by your userID, e.g. squeue -u duckID.
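Once a job finishes, it drops out of the squeue listing. If job accounting is enabled on the cluster (common on Slurm systems, but confirm in the Talapas documentation), the sacct command can report on completed jobs, e.g.:

[duckID@ln1 helloworld]$ sacct -j 20190 --format=JobID,JobName,State,Elapsed,ExitCode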
If necessary, cancel your job using the scancel command followed by the job number of the job you wish to cancel.

[duckID@ln1 helloworld]$ scancel 20190
[duckID@ln1 helloworld]$ squeue -u duckID
       JOBID PARTITION     NAME     USER ST        TIME NODES NODELIST(REASON)
[duckID@ln1 helloworld]$
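scancel also accepts filters. For example, to cancel all of your own jobs at once (use with care):

[duckID@ln1 helloworld]$ scancel -u duckID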
You are now ready to submit jobs on the Talapas cluster. For instructions on more advanced topics like parallel jobs, job arrays, and interactive jobs, see the other "How-to" guides on this site.