
Containers

Have you ever found yourself saying:

  1. I don't want to use Hyak because I can't install whatever I want. I found a command online, but I can't set up the "program" package with sudo apt-get install program. I'd rather be using Ubuntu.

  2. I want to work on my local computer, then move to Hyak and resume without friction. Or: I set something up and now want to share that exact compute environment with a collaborator (or vice versa).

The answer to either (or both) of these is containers! Software containers package everything you need into a single file that you can send around and that works the same across environments. The most popular containerization format is Docker, but it requires administrative access to run natively, so for shared platforms (e.g., HPC clusters like Hyak) an alternative called Apptainer was developed. Almost every Docker container can be converted to Apptainer, so the two are effectively interchangeable.

What are the costs, trade-offs, or downsides? You might expect containerized applications to run slower than native ones. This is generally not the case: in most instances performance is nearly equivalent.

Apptainer (formerly Singularity)

March 2022: 'Singularity' became a Linux Foundation supported project and was renamed 'Apptainer' [www].

The official Apptainer documentation [www] is the best source.

Ubuntu apt-get Example

Let's say that you want to use git and the current version of git on Hyak is 1.8.3.1 as shown below.

$ which git
/usr/bin/git
$ git --version
git version 1.8.3.1
$

Let's say you want a newer version AND you also want it running on Ubuntu for some reason. Here we'll walk you through installing the latest git binary using apt repositories for Ubuntu 16.04 [www] or "Xenial Xerus".

  1. Get an interactive session using some variant of the below command.
salloc -A mygroup -p compute -N 1 -n 2 --mem=10G --time=1:00:00
  1. Load the Apptainer module.
module load apptainer
  1. Create an Apptainer definition file. Mine, called tools.def, is shown below and installs the latest curl and git binaries from the Ubuntu repositories. Please see the Apptainer definition files reference page [www] for more advanced options.
Bootstrap: docker
From: ubuntu:16.04
%post
apt -y update
apt -y install curl git

  1. Build an Apptainer container from its definition file. The generated SIF file is your portable container.

    The .def definition file should be either A) executable or B) given as a relative path (e.g., ./tools.def while in the same directory as the file) or an absolute path (e.g., /full/path/to/tools.def).

    When using the --fakeroot option, build the container image in /tmp. This avoids [a potential permission issue] with our shared storage filesystem, GPFS.

apptainer build --fakeroot /tmp/tools.sif ./tools.def
Disk Quota Exceeded

If you run into a Disk Quota Exceeded error when building the container, you have likely exceeded the storage limit of your home directory, where the Apptainer cache is located by default. Your home directory has a 10GB storage limit, so it helps to monitor your usage. To assess the storage used in your home directory, run the following command from that directory:

du -h --max-depth 1

If your usage exceeds the 10GB quota, you will need to free up space. Clearing the Apptainer cache can help:

apptainer cache clean
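
The check described above can be sketched as a small script. This is a minimal sketch, not a Hyak-provided tool: the over_quota helper is a hypothetical name, and it assumes the 10GB limit can be approximated as 10 GiB when comparing against the KiB total that du -sk reports.

```shell
# Hypothetical helper: succeed when usage ($1, in KiB) exceeds a limit ($2, in GiB).
over_quota() {
    [ "$1" -gt $(( $2 * 1024 * 1024 )) ]
}

# du -sk prints the total size of the directory in KiB, followed by a tab and the path.
used_kib=$(du -sk "$HOME" 2>/dev/null | cut -f1)

if over_quota "${used_kib:-0}" 10; then
    echo "Home directory is over 10GB; consider running: apptainer cache clean"
else
    echo "Home directory is under 10GB"
fi
```

If the script reports that you are over the limit, the du -h --max-depth 1 command above shows which subdirectories take the most space.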

Default Apptainer Cache

You can now start a job on a compute node with salloc if you have not already. When a job starts on a compute node, that node's local storage is available for temporary read/write operations by the job. This makes it a good place for the Apptainer cache, and the physical architecture of our shared filesystem does not interfere with it. You can configure Apptainer to store its cache in a directory on the compute node's local storage with:

export APPTAINER_CACHEDIR=/scr
# or
export APPTAINER_CACHEDIR=/tmp
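
If you are unsure which of the two locations exists on your node, a sketch like the following picks the first one that is writable. The first_writable_dir function is a hypothetical helper written for this example, not part of Apptainer.

```shell
# Hypothetical helper: print the first argument that is an existing, writable directory.
first_writable_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Prefer /scr, fall back to /tmp, and otherwise keep Apptainer's default cache location.
if cache_dir=$(first_writable_dir /scr /tmp); then
    export APPTAINER_CACHEDIR="$cache_dir"
    echo "Apptainer cache directory: $APPTAINER_CACHEDIR"
else
    echo "Neither /scr nor /tmp is writable; keeping the default cache location" >&2
fi
```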

You can now proceed with building your container:

apptainer build container-image.sif container-recipe.def
  1. Move the container image to the desired location and run the binary.

You'll run it through the apptainer executable to distinguish it from the git binary in the host operating system.

$ mv /tmp/tools.sif .
$ apptainer exec tools.sif git --version
git version 2.7.4
$

Notice the git version here is newer than the original we started with. Success!
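
For reference, the steps above can be collected into one script to run inside an interactive job after module load apptainer. This is a sketch under assumptions: WORKDIR is a hypothetical variable that defaults to the current directory, and the apptainer commands are skipped when the executable is not on your PATH.

```shell
#!/bin/sh
# Stop at the first failing command so a failed build is not followed by mv/exec.
set -eu

workdir="${WORKDIR:-$PWD}"        # hypothetical variable; defaults to the current directory
def_file="$workdir/tools.def"

# Write the definition file from the tutorial.
cat > "$def_file" <<'EOF'
Bootstrap: docker
From: ubuntu:16.04
%post
apt -y update
apt -y install curl git
EOF

if command -v apptainer >/dev/null 2>&1; then
    # Build in /tmp when using --fakeroot, then move the image next to the definition file.
    apptainer build --fakeroot /tmp/tools.sif "$def_file"
    mv /tmp/tools.sif "$workdir/"
    apptainer exec "$workdir/tools.sif" git --version
else
    echo "apptainer not found; run 'module load apptainer' first" >&2
fi
```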

Container Repositories

If you followed the tutorial above, you should be able to install anything you want, but why reinvent the wheel? A large developer community maintains containers for most of the common scientific applications.

Docker Hub

Screenshot of Docker Hub's "Container" page

The biggest collection of Docker images is from Docker Hub [www].

Let's say Docker Hub tells you the pull command for the container you want is docker pull gcc:11.1.0-bullseye. To have Apptainer grab this Docker container and convert it to an Apptainer container, you'd change the command to apptainer pull docker://gcc:11.1.0-bullseye.
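
The rewrite is mechanical, so it can be sketched as a tiny shell function. The docker_to_apptainer_pull name is hypothetical and exists only for this illustration; it prints the converted command rather than running it.

```shell
# Hypothetical helper: turn a "docker pull IMAGE" command into the Apptainer equivalent.
docker_to_apptainer_pull() {
    image="${1#docker pull }"     # strip the leading "docker pull " if present
    printf 'apptainer pull docker://%s\n' "$image"
}

docker_to_apptainer_pull "docker pull gcc:11.1.0-bullseye"
# prints: apptainer pull docker://gcc:11.1.0-bullseye
```

The same prefixing applies to images hosted on other Docker registries, where the image reference includes the registry hostname.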

NVIDIA GPU Cloud (NGC)

Screenshot of NVIDIA GPU Cloud "Containers" page

NVIDIA provides the NVIDIA GPU Cloud (NGC) [www], a container registry that specializes in common GPU-accelerated applications and GPU software development tools. For example, you might want to use a PyTorch container optimized for NVIDIA GPUs, as seen below.

Screenshot of PyTorch container page

Depending on the NGC container, its page might give the exact pull command for Apptainer. If that command does not work, be sure to prepend the pull location with docker://, since these are native Docker containers that need to be converted to Apptainer.

The example above provides a Docker pull command for PyTorch. Modify it just as you would a command from Docker Hub: docker pull nvcr.io/nvidia/pytorch:21.05-py3 becomes apptainer pull docker://nvcr.io/nvidia/pytorch:21.05-py3.

Biocontainers.pro

Screenshot of Biocontainers.pro homepage

A bioinformatics-focused set of containers can be found at the Biocontainers.pro registry [www]. It is a collection of Docker containers (convertible to Apptainer) as well as native Apptainer containers.

Sylabs.io Cloud Library

Screenshot of Sylabs.io Cloud Library page

The largest collection of native Apptainer containers can be found at the Sylabs.io Cloud Container Library [www]. This would be the ideal first place to look for containers built by others since it is maintained by the creators of Apptainer and provides the native container format.

NGC API Keys

On rare occasions a container in the NGC app store will require that you have an API key. This is the only reason you'd need to register for a user account with NGC. Once you have an NGC account and are logged in, open the pull-down menu at the top right, select "Setup", and choose "Get API Key". Save that string of text.

You only have to register your API key once. Load the ngc module (i.e., module load ngc) and run ngc config set, which will prompt you for your API key. It's fine to select "ascii" as the format type. Your API key will be stored in a .ngc/config file in your home directory.

$ ngc config current
+-------------+----------------------------------------------------------+--------------------+
| key         | value                                                    | source             |
+-------------+----------------------------------------------------------+--------------------+
| apikey      | ******************************************************** | user settings file |
|             | ************************YTc3                             |                    |
| format_type | ascii                                                    | user settings file |
+-------------+----------------------------------------------------------+--------------------+
$
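
To check whether an API key has already been registered before running ngc config set again, a sketch like the following looks for the config file described above. The has_ngc_config function is a hypothetical helper; it only tests that the file exists and does not validate the key.

```shell
# Hypothetical helper: succeed when the given home directory contains .ngc/config.
has_ngc_config() {
    [ -f "$1/.ngc/config" ]
}

if has_ngc_config "$HOME"; then
    echo "NGC config found at $HOME/.ngc/config"
else
    echo "No NGC config yet: run 'module load ngc' and then 'ngc config set'"
fi
```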