Here are the basics of using the Talapas HPC cluster.
Logging on to Talapas
Your username and password on Talapas are your Duck ID and its associated password, i.e. the same credentials you use for your UO email. (That is, if your email address is alice@uoregon.edu, your Talapas username will be "alice".) Your password is the same university-wide, and can be managed at the UO password reset page.
Talapas has two login nodes, which can be reached directly at:
talapas-ln1.uoregon.edu
talapas-ln2.uoregon.edu
These hosts are entirely equivalent. You can use whichever seems less busy.
The login nodes are for light tasks needed to set up and submit your work. They're not for running applications, simulations, etc., and doing so will result in loss of privileges. Run those heavier tasks on the compute nodes by using SLURM, either as batch jobs or in interactive sessions.
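Connecting to a login node from a terminal might look like the following sketch. It reuses the example username "alice" from above and assumes a standard ssh client on your machine; substitute your own Duck ID.

```shell
# "alice" is the example username from the text; use your own Duck ID.
duckid="alice"
# Either login node works; talapas-ln2.uoregon.edu is equivalent.
host="talapas-ln1.uoregon.edu"
login_target="${duckid}@${host}"

# Print the command that would be run.
echo "ssh ${login_target}"

# To actually connect (you will be prompted for your UO password):
#   ssh "$login_target"
```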
Job Submission with SLURM
Talapas uses SLURM as its job scheduler and resource manager. SLURM provides a way to submit large computational tasks to the cluster in a batch fashion. You can also use it to get an interactive shell on a compute node, for interactive tasks that are not suitable for the login nodes.
Batch Job Submission
To run a job on Talapas you must first create a SLURM job script describing the resources your job requires and the executables to be run. You then submit your job script to the scheduler using the sbatch command. If the necessary resources are currently available, your job will run immediately. If not, your job will be placed in the job queue and will be run when the necessary resources become available. To check on the status of your job, use the squeue command. To cancel a job you've submitted, use the scancel command. To list the partitions on the cluster and see their status use the sinfo command.
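The workflow above can be sketched as follows. This is a minimal illustration, not an official template: the partition name "short", the account name "mypirg", and the file name hello.sbatch are placeholders chosen for this example, and the resource values are arbitrary. Check the Partition List for names that apply to your PIRG.

```shell
# Write a minimal job script. Partition and account are placeholders
# (assumptions); replace them with values valid for your PIRG.
cat > hello.sbatch <<'EOF'
#!/bin/bash
# Job name shown by squeue
#SBATCH --job-name=hello
# Placeholder partition and PIRG account
#SBATCH --partition=short
#SBATCH --account=mypirg
# Wall-clock limit (HH:MM:SS), one task on one node, 1 GB of memory
#SBATCH --time=00:05:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=1G
# %j expands to the job ID
#SBATCH --output=hello-%j.out

echo "Running on $(hostname)"
EOF

# Submit and monitor only where the SLURM client tools are installed.
if command -v sbatch >/dev/null 2>&1; then
    jobid=$(sbatch --parsable hello.sbatch)   # --parsable prints the job ID
    echo "Submitted job $jobid"
    squeue -u "$USER"                         # status of your jobs
    sinfo                                     # partitions and their state
    # scancel "$jobid"                        # uncomment to cancel the job
else
    echo "sbatch not found: run this from a Talapas login node" >&2
fi
```

The job's output lands in hello-<jobid>.out in the directory you submitted from.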
Interactive Shells
To start an interactive shell on Talapas, use the srun command. In most regards, this works the same as a batch job submission, but will open a shell connection to the selected compute node (as with ssh), and you can then execute commands, etc.
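A request for an interactive shell might look like the sketch below. As before, the partition and account names are placeholders assumed for illustration, and the time and memory values are arbitrary.

```shell
# Placeholder values (assumptions): use a partition and PIRG you have access to.
partition="short"
account="mypirg"

if command -v srun >/dev/null 2>&1; then
    # --pty attaches your terminal to the shell started on the compute node.
    srun --partition="$partition" --account="$account" \
         --time=01:00:00 --nodes=1 --ntasks=1 --mem=2G --pty bash
else
    echo "srun not found: run this from a Talapas login node" >&2
fi
```

Exiting the shell ends the session and releases the allocated resources.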
SLURM Partitions
Partitions are essentially "queues" that provide different resource sets to their jobs. There are partitions for "short" and "long" jobs, partitions that provide GPU nodes or not, and partitions that provide large RAM nodes.
Talapas is run on a dual club/condo model. Members of the compute club have access to all the University-owned compute resources, while condo users have access to the condo partition corresponding to the resources they have purchased. (Note that users may be members of both the condo and the compute club.) For a list of partitions and which PIRGs (Principal Investigator Research Groups) have access to them, see the Partition List.
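To see what each partition offers before choosing one, sinfo can print a summary or selected columns. The column selection below is just one reasonable choice for this sketch.

```shell
# Columns: partition, availability, time limit, node count,
# memory per node (MB), generic resources such as GPUs.
fmt="%P %a %l %D %m %G"

if command -v sinfo >/dev/null 2>&1; then
    sinfo -s          # one summary line per partition
    sinfo -o "$fmt"   # selected columns only
else
    echo "sinfo not found: run this from a Talapas login node" >&2
fi
```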
Storage
Storage space on Talapas is made available via the Talapas storage club and can be purchased by a PIRG. Storage is accounted for according to the group ownership of the file and it's important that ownership is correctly attributed.
Except for local scratch space, all directories are on shared GPFS storage, visible on all cluster hosts.
Home Space
Each user on Talapas is assigned a personal home directory located at /home/<username>. This is limited to 10GB. Each user also has an individual PIRG directory at /home/<PIRG>/<username> for larger datasets, etc.
By default, permissions on both are set to drwx------, i.e., user only.
Project Space
Each PIRG on the system has a shared project space located at /projects/<PIRG>/shared. By default, permissions are set to drwxrws---, i.e., group permissions.
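The two permission strings mentioned above can be reproduced on throwaway directories, which may help when checking or repairing permissions. This sketch works in a temporary directory rather than on real Talapas paths; the directory names are illustrative only.

```shell
# Work in a temporary directory so nothing real is touched.
demo=$(mktemp -d)
mkdir "$demo/home" "$demo/shared"

# 700: read/write/execute for the owner only.
chmod 700 "$demo/home"
# 2770: owner and group have full access; the leading 2 sets the setgid
# bit, so new files inherit the directory's group.
chmod 2770 "$demo/shared"

# The first ten characters of `ls -ld` are the type and permission string.
home_mode=$(ls -ld "$demo/home" | cut -c1-10)
shared_mode=$(ls -ld "$demo/shared" | cut -c1-10)
echo "$home_mode"     # drwx------
echo "$shared_mode"   # drwxrws---

rm -rf "$demo"
```

On a real directory, ls -ld also shows the owning user and group, which is what storage accounting is based on.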
Local Scratch
Each compute node has a local scratch disk. The size of the local disk depends on the type of compute node and can be found on the Machine Specifications page. Please have your jobs remove their scratch files when they finish.
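One way to make sure scratch files are removed, even when a job fails partway, is an EXIT trap. The sketch below assumes a variable SCRATCH_BASE pointing at the node's local scratch location; that name and its fallback to the system temporary directory are assumptions of this example, not Talapas settings.

```shell
# SCRATCH_BASE is a hypothetical variable: point it at the node's local
# scratch location. It falls back to the system temporary directory here.
SCRATCH_BASE="${SCRATCH_BASE:-${TMPDIR:-/tmp}}"

run_with_scratch() {
    (
        # Create a private working directory on scratch.
        workdir=$(mktemp -d "$SCRATCH_BASE/job.XXXXXX") || exit 1
        # Remove it when this subshell exits, whether it succeeds or fails.
        trap 'rm -rf "$workdir"' EXIT

        echo "$workdir"                                  # report the path used
        echo "intermediate data" > "$workdir/partial.txt"
        # ... the real work of the job would go here ...
    )
}

used_dir=$(run_with_scratch)
echo "Scratch directory $used_dir has been cleaned up"
```

In a real job script the mktemp and trap lines can sit directly at the top of the script, without the wrapping function.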
Software
Talapas uses the Lmod environment module software to manage Linux environment variables and provide multiple software versions. Users can run the module spider command to search for particular software packages on the system. The module avail command will show a list of packages whose dependencies are currently loaded. Use module load to add a software package to your environment and module unload to remove it.
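A typical module session might look like the sketch below. The package name "python3" is a placeholder assumed for illustration; use module spider to find the names actually installed.

```shell
pkg="python3"   # placeholder package name (assumption)

if type module >/dev/null 2>&1; then
    module spider "$pkg"   # search all installed versions of the package
    module avail           # packages loadable with the current environment
    module load "$pkg"     # add the package to your environment
    module list            # show what is currently loaded
    module unload "$pkg"   # remove it again
else
    echo "module command not found: run this on Talapas" >&2
fi
```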