The new HPC cluster keeps the same name, Talapas (pronounced ta-la-pas), but has newer, better hardware and runs RHEL 8, a newer version of the base operating system. Notably, a number of Nvidia A100 GPUs will be available; these are much faster than the existing K80s. Although some things have changed, most changes are for the better, and most software should continue to “just work”.

The least you need to know

  • Access is now allowed only via the Talapas VPN. See below for connection instructions.

  • Talapas login nodes are now behind a load balancer. This means that ‘tmux’, ‘screen’, and other long-running server processes will no longer work as before. See below.

  • The partitions have changed. You can see them with the ‘sinfo’ command, and the naming is intuitive. The time limits are currently the same as on the existing Talapas.

  • Default memory for all jobs is now 4GB. If your job needs more, you will need to explicitly request it.

  • Depending on how existing GPU software was compiled, it may need to be recompiled or upgraded to work with the newer GPUs.

  • In some cases, RHEL shared library changes or other things may break existing software. File a ticket, and we’ll get it fixed ASAP.
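As a minimal sketch of requesting more than the 4GB default, the batch script below uses Slurm's ‘--mem’ option. The job name, the 16G figure, and the time limit are placeholder assumptions, not site recommendations. No partition is named, so choose one from the ‘sinfo’ output and add it before submitting.

```shell
#!/bin/bash
#SBATCH --job-name=mem-example   # hypothetical job name
#SBATCH --mem=16G                # request 16 GB per node instead of the 4 GB default
#SBATCH --time=01:00:00          # placeholder one-hour wall-clock limit
#SBATCH --cpus-per-task=1

# Inside a job, Slurm exports the --mem request (in megabytes) as SLURM_MEM_PER_NODE.
# Fall back to a placeholder when the script is run outside of a Slurm job.
mem_mb="${SLURM_MEM_PER_NODE:-unset}"
echo "Memory requested per node (MB): ${mem_mb}"
```

Saved under a name of your choice and submitted with ‘sbatch’, the script reports the memory request that Slurm actually applied to the job.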

Logging in to the new Talapas

Talapas VPN is now required

To access the new cluster in any way, your laptop or other device must be connected to the Talapas VPN. This VPN works much like the UO VPN, if you’ve used that before: it is intended to provide the same capabilities, plus access to Talapas.

To do this, follow the instructions

Notable updates

  • Red Hat Enterprise Linux 8 (RHEL8) operating system

  • Intel (Ice Lake) and AMD (Milan) 3rd generation processors

  • Optane and DDR4 3200MT/s memory

  • Nvidia 80GB A100 PCIe4 GPUs

  • More storage in user home directories and job I/O scratch space

Login

DuckID

A UO DuckID is required to access the cluster. There are no ‘local’ accounts.

Talapas uses UO identity access management (Microsoft Active Directory) and therefore requires users to have a valid UO DuckID. Links are provided below for external collaborators or graduating researchers to continue their access to the cluster.

Talapas VPN

Talapas VPN is required to access the new cluster. The Talapas VPN should provide all the same capabilities as UO VPN as well as access to Talapas.

Follow the instructions in the article “Getting Started with UO VPN” (uoregon.edu), but use “uovpn.uoregon.edu/talapas” as the connection URL. The username and password are your standard DuckID and its password.

...

  • Hostnames now use the long form. (e.g., “login1.talapas.uoregon.edu”)

  • You may need to use the long form of hostnames to access other campus hosts. That is, using “somehost” may not work, but “somehost.uoregon.edu” will.

  • Linux group names have changed and are now longer. For example, “is.racs.pirg.bgmp” instead of “bgmp”. Since this information is now coming from the campus Active Directory server, there are a number of other mysterious AD groups included. You can just ignore these.

  • Currently, lookup of group names can be quite slow, taking 30 seconds or longer. We’ll work on speeding this up.

  • Generally, RACS is discouraging the use of POSIX ACLs on the new cluster. You can still use them yourself, but there are now other alternatives. If you’re tempted to use ACLs to solve a problem, consider asking about the alternatives.

  • In RHEL 8, the distribution executables seem to be fully stripped, with all debug symbols removed. There’s probably another way to obtain the symbols, and we’ll look into it eventually.
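The long-hostname rule above can be wrapped in a small shell helper. This is only a sketch: the function name ‘fqdn’ is hypothetical, and it assumes that every short campus hostname lives directly under uoregon.edu, as in the “somehost” example.

```shell
#!/bin/bash
# fqdn: hypothetical helper that expands a short campus hostname to its long form.
# Assumption: any name without a dot belongs directly under uoregon.edu.
fqdn() {
    case "$1" in
        *.*) printf '%s\n' "$1" ;;              # already contains a dot: leave unchanged
        *)   printf '%s.uoregon.edu\n' "$1" ;;  # short name: append the campus domain
    esac
}

fqdn somehost                     # prints somehost.uoregon.edu
fqdn login1.talapas.uoregon.edu   # prints login1.talapas.uoregon.edu
```

With the function defined in your shell, a command such as ‘ssh "$(fqdn somehost)"’ always receives the long form.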
