Here are the basics of using the Talapas HPC cluster.

Logging on to Talapas

Your username on Talapas is your Duck ID.  (That is, if your email address is alice@uoregon.edu, your Talapas username is "alice".)  Your password is your university-wide Duck ID password, which can be managed at the UO password reset page.

Talapas has two login nodes:

  • talapas-ln1.uoregon.edu
  • talapas-ln2.uoregon.edu

These hosts are entirely equivalent.  You can use whichever seems less busy.
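
For example, from a terminal on your own machine you can connect to a login node with ssh, replacing alice with your own Duck ID:

    ssh alice@talapas-ln1.uoregon.edu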

The login nodes are for light tasks needed to set up and submit your work.  They're not for running applications, simulations, etc., and doing so will result in loss of privileges.  Run those heavier tasks on the compute nodes by using SLURM.

Job Submission with SLURM

The SLURM job scheduler and resource manager provides a way to submit large computational tasks to the cluster in batch fashion.  You can also use it to get an interactive shell on a compute node for interactive work that is too heavy for the login nodes.

Batch Job Submission

To run a job on Talapas, first create a SLURM job script describing the resources your job requires and the executables to be run, then submit it to the scheduler with the sbatch command.  If the necessary resources are currently available, your job will run immediately; if not, it will be placed in the job queue and run when the resources become available.  The most commonly used commands are:

  • sbatch - submit a job script to the scheduler
  • squeue - check the status of your jobs
  • scancel - cancel a job you have submitted
  • sinfo - list the partitions on the cluster and their status
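
As a minimal sketch, a job script might look like the following.  The partition, account, and program names are placeholders: substitute a partition your PIRG can use (see the Partition List) and your own executable.

    #!/bin/bash
    #SBATCH --job-name=myjob            # a name for the job
    #SBATCH --account=<PIRG>            # your PIRG (replace with your group name)
    #SBATCH --partition=<partition>     # a partition your PIRG has access to
    #SBATCH --time=0-01:00:00           # wall-clock time limit (D-HH:MM:SS)
    #SBATCH --nodes=1                   # number of nodes
    #SBATCH --ntasks-per-node=1         # tasks per node
    #SBATCH --cpus-per-task=1           # CPU cores per task
    #SBATCH --mem=4G                    # memory per node
    #SBATCH --output=myjob-%j.out       # output file; %j expands to the job ID

    ./my_program input.dat

Assuming the script is saved as myjob.sbatch, you would then submit, monitor, and if necessary cancel the job along these lines:

    sbatch myjob.sbatch        # submit; prints the assigned job ID
    squeue -u <username>       # show your queued and running jobs
    scancel <jobid>            # cancel the job with the given ID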

Interactive Shells

To start an interactive shell on Talapas, use the srun command.  In most respects this works like a batch job submission, but it opens a shell connection to the selected compute node (much as ssh would), where you can then run commands interactively.
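
For example, a command along the following lines (the partition and account are again placeholders) requests a single core for one hour and opens a bash shell on the assigned compute node:

    srun --partition=<partition> --account=<PIRG> --time=0-01:00:00 --pty bash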

SLURM Partitions

Partitions are essentially "queues" that provide different resource sets to their jobs.  There are partitions for "short" and "long" jobs, partitions with and without GPU nodes, and partitions that provide large-memory nodes.

Talapas is run on a dual club/condo model.  Members of the compute club have access to all University-owned compute resources, while condo users have access to the condo partition corresponding to the resources they have purchased.  (Note that users may be members of both a condo and the compute club.)  For a list of partitions and which PIRGs (Principal Investigator Research Groups) have access to them, see the Partition List.
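
To see which partitions exist and their current state, you can query the scheduler directly (the partition name below is a placeholder):

    sinfo                          # list all partitions and the state of their nodes
    sinfo --partition=<partition>  # show a single partition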

Storage

Storage space on Talapas is made available via the Talapas storage club and can be purchased by a PIRG.  Storage usage is accounted for according to the group ownership of each file, so it is important that group ownership is attributed correctly.
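
If you need to check or correct which group a file is charged to, the standard Linux tools work; <PIRG> and the file name below are placeholders:

    ls -l mydata.dat           # the fourth column shows the group that owns the file
    chgrp <PIRG> mydata.dat    # reassign the file to your PIRG's group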

Except for local scratch space, all directories are on shared GPFS storage, visible on all cluster hosts.

Home Space

Each user on Talapas is assigned a personal home directory located at /home/<username>, limited to 10 GB.  Each user also has an individual PIRG directory at /home/<PIRG>/<username> for larger datasets, etc.

By default, permissions on both are set to drwx------, i.e., user only.
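
You can confirm these defaults with ls (alice and <PIRG> are placeholders):

    ls -ld /home/alice /home/<PIRG>/alice    # both should show drwx------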

Project Space

Each PIRG on the system has a shared project space located at /projects/<PIRG>/shared.  By default, permissions are set to drwxrws---, i.e., group read/write, with the setgid bit set so that new files and directories inherit the PIRG's group.
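
For example, to share a result with the rest of your PIRG you might copy it into the shared space and verify the group (the file name and <PIRG> are placeholders):

    cp results.dat /projects/<PIRG>/shared/
    ls -l /projects/<PIRG>/shared/results.dat   # the group column should show your PIRG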

Local Scratch

Each compute node has a local scratch disk. The size of the local disk depends on the type of compute node and can be found on the Machine Specifications page.  Please have your jobs remove their scratch files when they finish.
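
One common cleanup pattern is sketched below: create a per-job directory in the local scratch area and remove it when the job exits.  The /tmp path is only an assumption here; use the actual local scratch location for the node type (see the Machine Specifications page).

    # inside your job script; /tmp is an assumed location for local scratch
    SCRATCH=$(mktemp -d -p /tmp "${SLURM_JOB_ID}.XXXXXX")
    trap 'rm -rf "$SCRATCH"' EXIT      # remove the scratch directory when the job ends
    # ... write temporary files under $SCRATCH while the job runs ...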

Software

Talapas uses the Lmod environment module system to manage environment variables and to provide multiple versions of installed software.  Users can run the module spider command to search for particular software packages on the system.  The module avail command shows the packages whose dependencies are currently loaded.  Use module load to add a software package to your environment and module unload to remove it.
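
For example (the package names below are only illustrative; use module spider to find the actual names and versions installed on Talapas):

    module spider python        # search for anything matching "python"
    module avail                # list modules loadable with the current dependencies
    module load python3         # add a package to your environment
    module list                 # show the modules currently loaded
    module unload python3       # remove it again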
