OS upgrade for klone
Nam Pho
Director for Research Computing
Note: klone has a new OS. We upgraded from CentOS 8 to Rocky Linux.
# Background
In late 2020, while we were building the current-generation cluster, klone, our previous-generation cluster, MOX, was running CentOS 7, which was nearing end-of-life support. We used the transition to klone as an opportunity to deploy CentOS 8, the world's most popular OS in academic research computing environments. Unfortunately, around the time we were wrapping up klone's software stack, the CentOS project announced [1, 2] a transition of its own: Red Hat unilaterally terminated the development of CentOS as an open-source version of Red Hat Enterprise Linux (RHEL). CentOS would instead become an upstream version of RHEL; in other words, more experimental and ultimately less stable.
As the dust from this announcement settled, a consensus emerged: Rocky Linux, led by the initial founder of the CentOS project, Greg Kurtzer, would become the CentOS successor.
# The Transition
Fast-forward to late 2021: after our summer '21 launch of klone and our fall '21 cluster capacity expansion, we were finally able to turn our attention to the CentOS-to-Rocky migration. And just in time, too, because CentOS 8 (the operating system we had deployed just months earlier) would be officially unsupported after December 31, 2021.
Our goal was to make this OS transition as smooth and unnoticeable to our users as possible. After all, this is our mission: we take care of the tech so that you can take care of the science. Rocky, like CentOS, is intended to be a bug-for-bug compatible, open-source version of RHEL, and with its talented, globe-spanning team of developers, we were confident that the impact of this transition would be minimal.
We began the transition with our backend during the December '21 maintenance: the klone head node, our Slurm scheduler, was successfully migrated to Rocky 8. So far so good! During our next maintenance, in January '22, we migrated all the compute node images to Rocky. A handful of users reported code-compiling issues, which we were able to resolve, but otherwise it was uneventful. We took extra care on the final piece of the Rocky migration, the login nodes, due to their accessibility from the wider internet. And, as of today's maintenance, we are excited and relieved to report that klone is now a 100% Rocky cluster! 🥳
# Summary
The Hyak team was forced to revisit a major OS migration mere months after the initial launch of klone. This is highly unusual, and no small feat, but we have prevailed. We deployed a widely supported, open-source OS with enterprise-level stability, while remaining cost-effective for the research community at the University of Washington. With this work behind us, we've arrived at a sustainable platform for the life of the klone cluster. We're excited for the future of klone, and excited to redirect our time back to feature development.
We want to give a huge thank-you to our users for their patience during this migration period. Spoiler alert: Rocky won, and it's a good thing!