...
Globus is also available and is particularly useful for transferring large files. See our wiki page, How-to: Use Globus.
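Globus also has a command-line client that can drive the same transfers. A minimal sketch is shown below; the endpoint UUIDs, paths, and label are placeholders, not actual Talapas values (see the How-to page for the real endpoint details).

Code Block

# authorize the CLI (one time; opens a browser prompt)
globus login

# look up the UUID of a collection/endpoint by name
globus endpoint search "Talapas"

# start a transfer; the UUIDs and paths below are placeholders
globus transfer SOURCE_UUID:/path/to/data DEST_UUID:/path/to/destination \
    --recursive --label "example transfer"

The transfer runs asynchronously on the Globus service, so you can log out while it completes.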
Storing your Data
As a Talapas user, you have these directories available for storing and manipulating your files.
...
Code Block

#!/bin/bash
#SBATCH --account=<myaccount>    ### change this to your actual account for charging
#SBATCH --partition=compute      ### queue to submit to
#SBATCH --job-name=myhostjob     ### job name
#SBATCH --output=hostname.out    ### file in which to store job stdout
#SBATCH --error=hostname.err     ### file in which to store job stderr
#SBATCH --time=5                 ### wall-clock time limit, in minutes
#SBATCH --mem=100M               ### memory limit per node, in MB
#SBATCH --nodes=1                ### number of nodes to use
#SBATCH --ntasks-per-node=1      ### number of tasks to launch per node
#SBATCH --cpus-per-task=1        ### number of cores for each task

hostname
...
Code Block

$ sbatch hostname.batch
Submitted batch job 12345

# checking with squeue, we see that the job has started running (state is 'R')
$ squeue -j 12345
  JOBID PARTITION      NAME  USER ST  TIME NODES NODELIST(REASON)
  12345   compute myhostjob  joec  R  5:08     1 n003

# later, we see that the job has successfully completed
$ sacct -j 12345
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
12345        myhostjob+    compute     joegrp          1  COMPLETED      0:0
12345.batch       batch                joegrp          1  COMPLETED      0:0
12345.exte+      extern                joegrp          1  COMPLETED      0:0
...
SLURM uses a number of queues, which it calls partitions, to run jobs with different properties. Normal jobs will use the compute partition, but there are other partitions for jobs that need to run longer than a day, need more memory, or need to use GPUs, for example. You can use the sinfo command to see detailed information about the partitions and their states.
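For example (the partition name below matches the examples in this page; your output will vary):

Code Block

# list all partitions, their time limits, and the state of their nodes
sinfo

# limit the listing to a single partition
sinfo --partition=compute

# show per-node detail (CPUs, memory, state) for that partition
sinfo --partition=compute --Node --long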
...
Code Block

srun --account=hpcrcf --pty --partition=compute --mem=1024M --time=240 bash
Most of the flags work just as in batch scripts, above. This command requests a single core (the default) on a node in the compute partition, with 1024 MB of memory, for four hours. (After four hours, the shell and all commands running under it will be killed.)
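Once the shell starts, you are working directly on a compute node; exiting the shell ends the job and releases the allocation. A quick sketch of a session (the node name and output are illustrative):

Code Block

# after srun starts the interactive shell, commands run on the compute node
$ hostname
n003
$ squeue -u $USER      # the interactive job appears like any other job
$ exit                 # ends the shell, the job, and the allocation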
...
Code Block

srun --account=hpcrcf --x11 --pty --partition=compute --mem=4G --time=4:00:00 bash
...
Code Block

xrun --account=hpcrcf --partition=compute --mem=1024 --time=4 bash
Note that this script supports only a subset of the srun flags and has some quirks. The --mem value must be a whole number and is in megabytes. The --time value is in HH:MM format, but if only a whole number is specified, it is taken as hours (rather than minutes).
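For example, reusing the flags shown above, a 90-minute request would be written in HH:MM form (the memory value here is illustrative):

Code Block

xrun --account=hpcrcf --partition=compute --mem=2048 --time=1:30 bash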
...
For time, the default varies by partition, but is generally the maximum available for the partition. For the compute partition, the default is currently 24 hours. For the computelong partition, it's currently two weeks. If your job will take significantly less time, you can specify a shorter duration, to increase the odds that it will be scheduled sooner.
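For example, a batch job expected to finish well within two hours might request just that (the value is illustrative):

Code Block

#SBATCH --time=2:00:00     ### wall-clock limit of two hours (HH:MM:SS)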
...