...


The new HPC cluster keeps the same name, Talapas (pronounced TA-la-pas), but runs on newer hardware and a newer operating system. Although some things have changed, most changes are for the better.

Notable updates

  • New operating system - Red Hat Enterprise Linux 8 (RHEL8)

  • New processors - 3rd generation Intel (Ice Lake) and 3rd generation AMD (Milan) processors

  • New GPUs - NVIDIA 80GB A100 PCIe4 GPUs in 10GB, 40GB, and 80GB MIG slices (see the example batch script after this list)

  • Faster memory - DDR4 3200MT/s and Optane memory in the ‘memory’ partitions

  • More storage - 250GB home directories and 10TB of scratch space per PIRG for job I/O
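As a sketch of how a MIG slice might be requested through Slurm, the batch script below asks for a single 10GB slice of an A100. The partition name and the gres string are assumptions for illustration (MIG profile names like 1g.10gb follow NVIDIA’s convention); check the cluster documentation for the actual names:

    #!/bin/bash
    #SBATCH --partition=gpu          # hypothetical partition name
    #SBATCH --gres=gpu:1g.10gb:1     # hypothetical gres for a 10GB A100 MIG slice
    #SBATCH --time=01:00:00
    #SBATCH --mem=16G

    # Report which GPU (or MIG slice) the job actually received
    nvidia-smi -L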

Login

DuckIDs

Talapas uses the UO Identity and Access Management system, Microsoft Active Directory, for authentication, which requires all users to have a valid UO DuckID.
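Once your DuckID is active, connecting is an ordinary SSH login with your DuckID credentials. A minimal sketch; the hostname below is a placeholder, not the cluster’s published address:

    # Replace <duckid> with your UO DuckID; the hostname is hypothetical
    ssh <duckid>@login.talapas.uoregon.edu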

...

  • /projects/<pirg>/<user>: store your PIRG-specific work files here.

  • /projects/<pirg>/shared: store PIRG collaboration data here.

  • /projects/<pirg>/scratch: store job-specific I/O data here. Each PIRG has a 10TB quota, and this directory is purged every 30 days (any data older than 30 days is deleted; see the sketch after this list).
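Because the purge is age-based, it can help to preview which files are at risk before the 30-day cutoff. A minimal sketch using standard find options, with <pirg> as a placeholder as above:

    # List files in a PIRG's scratch area not modified in the last 30 days;
    # these are candidates for the next purge.
    find /projects/<pirg>/scratch -type f -mtime +30 -ls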

Software

Some existing software will run fine on the new cluster. But with the update to RHEL8, there will likely be problems that require re-compiling or rebuilding.

Generally, issues are due to differences with the new shared libraries in RHEL 8 and perhaps to CPU architecture differences. For the latter, it’s important to note that the new login nodes have a newer CPU architecture than some of the compute nodes. If you compile software on a login node in a way that specifically assumes one architecture (e.g., Intel Ice Lake), it might not run on all of the compute nodes. (Typically, you’ll see an “Illegal instruction” error in that case.)
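One defensive approach is to target a baseline instruction set instead of the login node’s native one. A minimal sketch with GCC; the flags are standard GCC options, but the right baseline for this cluster is an assumption, so check the cluster documentation:

    # Compiling with -march=native on a login node bakes that node's
    # instruction set into the binary; on older compute nodes this can
    # fail with an "Illegal instruction" error:
    gcc -O2 -march=native -o myprog myprog.c

    # Targeting the generic x86-64 baseline keeps the binary runnable on
    # all nodes, at some cost in per-node optimization:
    gcc -O2 -march=x86-64 -o myprog myprog.c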

Conda

In addition to the original ‘miniconda’ instance, we now have a ‘miniconda-t2’ instance. To avoid compatibility issues, we will create and update Conda environments only in the latter instance on the new cluster. (Similarly, we won’t make updates to the original instance on the new cluster.) If you have personal Conda environments, you might wish to follow a similar policy. Note that using existing Conda environments on either cluster should work fine; it’s making changes that might cause problems.
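For example, on the new cluster you might create new environments only under ‘miniconda-t2’. A minimal sketch, assuming the instance is exposed as a module of the same name (the environment name and packages are placeholders):

    # Load the new Conda instance on the new cluster
    module load miniconda-t2

    # Create and activate a fresh environment there; existing environments
    # built under the original instance can still be activated and used.
    conda create -n myproject python=3.10 numpy
    conda activate myproject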

...