The new HPC cluster has the same name, Talapas (pronounced TA-la-pas), but newer hardware and a newer operating system. Although some things have changed, most changes are for the better, and most software should continue to “just work”.

Notable updates

Login

Duckids

Talapas uses the UO Identity and Access Management system, Microsoft Active Directory, for authentication, which requires all users to have a valid UO DuckID.

Links are provided below for external collaborators or graduating researchers to continue their access to the cluster.

Talapas VPN

A virtual private network (VPN) connection is required to access the cluster. This adds an extra layer of security. The Talapas VPN should provide all the same capabilities as the UO VPN, as well as access to Talapas.

Instructions here: Article - Getting Started with UO VPN (uoregon.edu)

Use “uovpn.uoregon.edu/talapas” as the connection URL, along with your DuckID and password.

Advanced users might want to use the OpenConnect VPN client, which supports connecting with a command such as:

sudo openconnect --protocol=anyconnect uovpn.uoregon.edu/talapas

Note: do not repeatedly attempt to log in when you’re getting error messages. As with other uses of your DuckID at UO, if you generate a large number of login failures, all DuckID access (including things like e-mail) will be locked University-wide. Similarly, be aware of automated processes like cron jobs that might trigger this situation without your notice.

Blocked ports

Note that inbound access to the new cluster is only allowed for SSH and Open OnDemand (coming soon). All other ports are blocked.

Load balancer

The preferred method of access is via login.talapas.uoregon.edu

The new cluster has three login nodes. A load balancer redirects SSH connections to different login nodes to spread the load. The load balancer's choice of login node is “sticky”: repeated connections from your IP address will go to the same login node, as long as there has been some activity within the last 24 hours.
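A typical connection through the load balancer looks like the following (the username `myduckid` is a placeholder; substitute your own DuckID):

```shell
# Connect via the load balancer, which picks one of the three login
# nodes and "sticks" to it for repeated connections from your IP.
ssh myduckid@login.talapas.uoregon.edu
```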

Slurm

Job control

Partitions

GPU

Three GPU memory sizes are available: 10GB, 40GB, and 80GB. Specify the GPU size with --constraint, for example: --constraint=gpu-10gb
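A minimal batch script using the constraint might look like the sketch below; the partition name and the --gres request are assumptions, so check the cluster documentation for the exact values on Talapas:

```shell
#!/bin/bash
#SBATCH --partition=gpu          # assumed partition name
#SBATCH --gres=gpu:1             # request one GPU
#SBATCH --constraint=gpu-10gb    # select the 10GB GPU memory size
#SBATCH --time=00:10:00

# Show which GPU the job was allocated
nvidia-smi
```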

CUDA A100 MIG slicing

Due to limitations with CUDA MIG slicing, it appears that a job can use only one slice (GPU) per host. That means one GPU per job unless MPI is used to orchestrate GPU usage across multiple hosts. See NVIDIA Multi-Instance GPU User Guide :: NVIDIA Tesla Documentation.
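Given that limitation, a multi-GPU MPI job would request one slice per node rather than several GPUs on one node. A hedged sketch (the partition name and the application name are hypothetical):

```shell
#!/bin/bash
#SBATCH --partition=gpu        # assumed partition name
#SBATCH --nodes=4              # four hosts...
#SBATCH --gpus-per-node=1      # ...one MIG slice (GPU) on each
#SBATCH --ntasks-per-node=1    # one MPI rank per host

# srun launches one rank per node; each rank uses its node's single slice
srun ./my_mpi_gpu_app          # hypothetical MPI application
```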

Coming soon

Technical Differences

These probably won’t affect you, but they are visible differences that you might notice.