Start Here
In a large, distributed environment such as an HPC cluster, a scheduler is required to request compute resources and dispatch user processes (called "jobs") on them. Several schedulers are available for HPC systems, but the most common one is the Simple Linux Utility for Resource Management, or "Slurm" for short [Wikipedia]. The Hyak clusters use Slurm, so general Slurm documentation you find on the internet applies, including the documentation published by the developer [SchedMD]. While the SchedMD documentation is the most authoritative, this section collects common examples you might find useful, with sample arguments specific to the klone cluster of Hyak.
Accounts
Every user is part of an account and thus has access to certain partitions. Your account is usually tied to a lab or research group that you belong to. For example, you may be part of a lab group that has contributed resources to Hyak, which gives you priority usage of those resources, organized into one or more partitions. Alternatively, you may be a student user who is part of the Research Computing Club, whose account is stf. In that case you have priority access through the stf account, which allows you to use several partitions.
Pro Tip - Get an STF account
If you are a student who is paying the student technology fee (STF), you are eligible for an stf account, which improves your access and user experience on Hyak because some resources are designated for students. Click here to find out how to get an STF account. NOTE: The Hyak Team doesn't manage the stf account group.
Partitions
The account(s) you are a part of determine the priority access you have to certain partitions. All users can use Hyak resources when they are idle by scheduling jobs on the ckpt, ckpt-g2, or ckpt-all partitions (click here to learn more about ckpt jobs).
What Resources Do You Have?
The hyakalloc command allows users to see which accounts and partitions they are a part of and the current utilization of these resources. Resource limits are directly proportional to what was contributed by that group. By default, the output of hyakalloc might look something like this:
Note that hyakalloc shows all the idle resources on the checkpoint (ckpt) partition you have access to. For a demo account, the output of hyakalloc should look something like this:
Users can pass several optional arguments to the hyakalloc command to execute specific actions. A list of all optional arguments prints to your screen when you run hyakalloc --help.
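As a minimal sketch of the two invocations described above, the snippet below runs hyakalloc with its default view and then with --help. Because hyakalloc is a Hyak-specific tool, the snippet first checks that the command exists. The guard and the hyak_tools variable are illustrative additions, not part of hyakalloc itself.

```shell
# hyakalloc is specific to Hyak, so check for it before calling it.
if command -v hyakalloc >/dev/null 2>&1; then
    hyakalloc           # default view: your accounts, partitions, and utilization
    hyakalloc --help    # list every optional argument
    hyak_tools="present"    # illustrative variable, not used by hyakalloc
else
    echo "hyakalloc not found: log in to a Hyak login node and try again"
    hyak_tools="missing"
fi
```

On a machine outside Hyak this only prints the hint, so it is safe to paste anywhere.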
The sinfo command allows users to view information about all partitions on Hyak. Like hyakalloc, the sinfo command supports a variety of optional arguments, allowing for more complex commands. For example, sinfo -s summarizes the default output of sinfo by printing a more concise and readable list of partitions and their corresponding host names.
Use sinfo --help for a detailed list of optional arguments.
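The sinfo invocations mentioned above can be tried together as follows. This is a sketch rather than a recommended workflow. The -p option, which restricts the listing to one partition, does not appear in the text above and is included as one example of an optional argument. The guard and the slurm_tools variable are illustrative additions.

```shell
# sinfo ships with Slurm, so check for it before calling it.
if command -v sinfo >/dev/null 2>&1; then
    sinfo               # default listing of partitions and node states
    sinfo -s            # summary: one concise line per partition
    sinfo -p ckpt       # restrict the listing to the ckpt partition
    slurm_tools="present"   # illustrative variable, not used by sinfo
else
    echo "sinfo not found: run this on a system with Slurm installed"
    slurm_tools="missing"
fi
```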
The following sections will walk through how to request and schedule jobs using Slurm's salloc and sbatch commands. As you go, keep using the hyakalloc command to check which accounts and partitions you can access, because this determines the resource types you can request.
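To illustrate how the account and partition reported by hyakalloc feed into a request, the sketch below assembles the corresponding Slurm flags and prints the resulting command lines without submitting anything. The values stf and ckpt are placeholders taken from the examples on this page, and job.sh is a hypothetical script name. Other options, such as time and memory limits, are left out here.

```shell
# Placeholder values: substitute an account and partition that hyakalloc lists for you.
account="stf"
partition="ckpt"

# --account and --partition are standard options for both salloc and sbatch.
request_flags="--account=$account --partition=$partition"

# Print the commands instead of running them, so nothing is submitted.
echo "salloc $request_flags"
echo "sbatch $request_flags job.sh"   # job.sh is a hypothetical batch script
```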