Environment modules are used to manage the software packages available on Talapas.
Environment Modules
Environment modules allow us to make multiple versions of the same software available on a single system. For example, a particular application may require GCC version 5.4 to compile correctly, while another may need version 6.1. Using environment modules, we can keep both versions on the same system, and users can switch between the two with minimal effort. We use the LMOD module software from TACC on Talapas. For information on using modules to load and unload software see our how-to guide on modules.
LMOD and Hierarchical Modules
One of the advantages of LMOD over the classical module software found on many legacy systems is its hierarchical nature. Modules are aware of their compiler and MPI dependencies and can only be loaded if their dependencies are already loaded.
There are hundreds if not thousands of software packages and library modules for various programming languages installed on Talapas. The good news is that whatever you need is likely to already be present. (And if not, we can often quickly add it.) The bad news is that sometimes it can be difficult to find.
The software on Talapas is available via a number of mechanisms. This is partly due to the historical evolution of Talapas, and partly because using multiple mechanisms allows us to get software running for you as quickly as possible.
The fundamental mechanism is called "Environment Modules". Layered on that are several alternative mechanisms: Anaconda, Spack, and EasyBuild. In addition, some programming languages like Python and R provide their own mechanisms for adding libraries (and sometimes commands).
Environment Modules
This mechanism provides a way to allow multiple versions of a piece of software to be installed at the same time. We use the most popular implementation, Lmod. The multiple software versions are installed in different locations, and the "module load" command modifies $PATH to make them available in a particular shell. In practice, other similar environment variables may also be modified, and occasionally shell aliases are added. The sum effect is to make the package available as with a traditional install.
To be concrete, you can do something like this to load a particular version of MATLAB:
$ module avail matlab
------------------------------- /packages/modulefiles/Core -------------------------------------
matlab/R2016b matlab/R2017b matlab/R2018b (D) matlab/R2019b matlab/R2020b
Where:
D: Default Module
Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
$ module load matlab/R2020b
Once you've done this, the 'matlab' command for that version will be in your path, and running it will run that version. More information is available in our how-to guide on modules.
Some users like to load modules in their "dot" files so that they will always be present. Although this has advantages, we recommend against it in general. Instead, consider loading modules as you need them, either in your shell, via a 'source' file, or in your SLURM scripts. In the long run, this is more reliable and reproducible.
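As a minimal sketch of loading modules inside a SLURM script, the commands below write a small batch file that loads one explicit MATLAB version before running it. The file name, job name, time limit, and the MATLAB command are placeholders chosen for illustration, and site-specific options such as partition or account are deliberately left out.

```shell
# Write a small batch script. The job name, time limit, and MATLAB command
# are placeholders; partition/account options are omitted on purpose.
cat > matlab_job.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=matlab-demo
#SBATCH --time=00:10:00
#SBATCH --output=matlab-demo-%j.out

# Load exactly the modules this job needs, with explicit versions.
module load matlab/R2020b

# Run a trivial MATLAB command non-interactively.
matlab -batch "disp(version)"
EOF

echo "wrote matlab_job.sbatch; submit it with: sbatch matlab_job.sbatch"
```

Because the module version is written into the script itself, the job does not depend on whatever happens to be loaded in your login shell.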
Anaconda
We also have software provided by Anaconda. This software is compiled upstream, which is convenient because users can easily install similar software on their own machines. (Usually, Linux, Windows, and Mac versions are available.)
The most common way to use this is to load 'tensorflow' or 'tensorflow2'. These modules actually load 'miniconda' and then activate an Anaconda environment that provides a lot of software, particularly software useful for machine learning and other scientific computing tasks. As you might expect, 'tensorflow' provides TensorFlow version 1, and 'tensorflow2' provides TensorFlow version 2. Both have 'numpy', 'scipy', and many other similar Python libraries.
More generically, you can just load 'miniconda'. If you then run 'conda env list', you will see a list of independent conda environments that provide various sets of software. Some have names indicating the primary software they provide. You can list the contents of each environment, or perhaps easier, just ask us.
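Listing the contents of an environment can be sketched as follows. The helper name 'list_env_packages' and the environment name in the usage comment are hypothetical; the helper only wraps 'conda list -n' and assumes 'module load miniconda' has already been run in the current shell.

```shell
# list_env_packages ENV: print the packages in one conda environment.
# Hypothetical helper; assumes 'module load miniconda' was run in this shell.
list_env_packages() {
    env_name="$1"
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found; try 'module load miniconda' first" >&2
        return 1
    fi
    conda list -n "$env_name"
}

# Example session (the environment name is a placeholder; pick a real one
# from the output of 'conda env list'):
#   module load miniconda
#   conda env list
#   list_env_packages some-env-name
```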
There are also legacy modules 'anaconda3' and 'anaconda2'. These are older Anaconda installs, and usually you should avoid them unless you are already using them. For the most part, we no longer update these. (Notably, most software is installed in the 'base' environment, which can cause problems.)
Spack
This mechanism provides a way for us to compile a lot of software locally. It's more complex and harder to use, so we usually prefer Anaconda when it provides the software you need. See the documentation for Spack.
To enable these packages, type 'module load racs-spack'. Then use 'spack find' and 'spack list' to see what's already installed and what could be installed, respectively.
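As a rough sketch, the two lookups can be combined into one helper that checks a single package name. The function name 'spack_lookup' and the package name in the usage comment are placeholders for illustration; the helper assumes 'module load racs-spack' has already been run.

```shell
# spack_lookup PKG: show installed builds of PKG, then matching package names
# that Spack knows how to build. Hypothetical helper; assumes
# 'module load racs-spack' was run in this shell.
spack_lookup() {
    pkg="$1"
    if ! command -v spack >/dev/null 2>&1; then
        echo "spack not found; try 'module load racs-spack' first" >&2
        return 1
    fi
    echo "== already installed =="
    spack find "$pkg"
    echo "== could be installed =="
    spack list "$pkg"
}

# Example session (the package name is a placeholder):
#   module load racs-spack
#   spack_lookup zlib
```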
EasyBuild
EasyBuild is a build system similar to Spack, but RACS no longer uses it. Our legacy packages are still available using 'module load racs-eb', but generally you should avoid them, as we no longer update these packages, and they are quite old at this point.
There is a second EasyBuild set of packages available via 'module load easybuild'. This is a community tree updated and supported by Jason Sydes. We can answer simple questions, but generally support is provided by him. Unless you were advised to load this for a class or research group, this is available on an as-is basis.
User Software
It's worth mentioning that you are welcome to install any software you like on Talapas (as long as licenses are complied with). Given the quotas on home directories, usually you will want to put this in your '/projects' directory. If you intend for other members of your lab to also use it, 'shared' is an ideal location. We'd be happy to provide specific advice.
The main advantage is that you can do what you like, as quickly as you can. A potential downside is that it's harder for us to help if something goes wrong.
Details
The most important thing to realize is that any given command can only come from one source. Many options above provide the 'python3' command, for example, but if you load more than one, your results may be unreliable. As a first cut, usually the last thing loaded "wins" in case of a conflict. The full story is more complicated, though, and if you load clashing packages, it's not difficult to end up in a situation with subtle failures. In any given shell (or SLURM script), it's best to load as few modules as possible.
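The "last thing loaded wins" behavior can be illustrated with a toy simulation that involves no real modules. It assumes that loading a module typically prepends a directory to $PATH; the command 'mytool' and both directories are made up for the demonstration.

```shell
# Toy simulation of two "modules" that both provide a command named 'mytool'.
# Nothing here touches real modules; the directories and 'mytool' are made up.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/v1/bin" "$demo_root/v2/bin"
printf '#!/bin/sh\necho "mytool 1.0"\n' > "$demo_root/v1/bin/mytool"
printf '#!/bin/sh\necho "mytool 2.0"\n' > "$demo_root/v2/bin/mytool"
chmod +x "$demo_root/v1/bin/mytool" "$demo_root/v2/bin/mytool"

# "Loading" is simulated by prepending each bin directory to PATH.
PATH="$demo_root/v1/bin:$PATH"   # first load
PATH="$demo_root/v2/bin:$PATH"   # second load, now earlier in PATH

# The shell runs the first match in PATH, so the most recent load wins.
winner="$(mytool)"
echo "$winner"

# 'command -v' shows which file the shell would actually run.
command -v mytool

rm -rf "$demo_root"
```

On the real system, 'module list' shows what is currently loaded, and in bash 'type -a python3' lists every 'python3' on your path in search order, which helps when tracking down a clash.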