April 2025 Maintenance Update

Kristen Finch

HPC Staff Scientist

April maintenance included the latest OS patches and security enhancements for login and compute nodes, along with testing endpoint detection and response (EDR) software to strengthen cluster security. We would like to bring your attention to upcoming events featuring an AWS AI in Research Workshop (April 17), our annual Research Computing Club Hackathon (April 19), and the Kopah S3 Storage Launch Event "Data Storage Day" (May 5), showcasing campus-wide S3-compatible storage. Additionally, GPU in EDU (May 15) will provide insights into GPU workflows demonstrated by experts at Cambridge Computer and NVIDIA. Regular office hours are available for research computing support. Stay informed through training resources, event subscriptions, and the UW-IT Research Computing Events Calendar. The next maintenance is scheduled for Tuesday May 13, 2025 (AKA the 2nd Tuesday of the month).

Notable Updates, New Features

  • Operating system - The login and compute node images were updated to address system patches and security updates in the Linux operating system (OS).
  • Security solutions testing - We tested endpoint detection and response (EDR) software as a potential solution for an enhanced cluster security posture. This testing is part of our ongoing efforts to maintain a secure computing environment in line with funding requirements (e.g., NIST-800, CMMC, HIPAA, NIH).
  • Globus - We’re excited to announce that Globus has been added to Hyak Klone and Kopah S3. Globus makes it easy to transfer large datasets reliably and securely between systems, whether across campus or around the world. With features like automated transfers, fault tolerance, and a simple web interface, it’s a powerful tool for streamlining data movement in research workflows.

Upcoming Training and Events

Subscribe to event updates and bookmark our UW-IT Research Computing Events Calendar.

  • AWS AI in Research Workshop - UW-IT Research Computing and eScience Institute will host a workshop introducing how AWS enables AI in research on Thursday, April 17th from 12:00 - 4:00 p.m. in the eScience Institute. Whether you are developing your own models, building off existing models, or you need to implement generative AI, AWS has you covered. Bring your laptop and join us for a couple of engaging sessions followed by a hands-on lab building AI agents. Learn more.

  • RCC Hackathon - The Research Computing Club (RCC), in collaboration with UW-IT Research Computing and the eScience Institute, is excited to host an all-day HPC Hackathon on Saturday, April 19th from 10:00 a.m. – 6:00 p.m. in the eScience Institute. Learn more.

  • Discover Kopah: Your New S3-Compatible Campus Storage Solution at Data Storage Day! Join us on Monday, May 5 from 1:00 – 5:00 p.m. in the eScience Institute for the official Kopah S3 Storage Service Launch Event — your gateway to fast, scalable storage right on campus. Whether you're a researcher handling big data, an instructor managing course materials, or simply someone in need of reliable cloud-like storage, Kopah is designed for everyone. This event will feature live demos of S3-compatible tools like s3cmd, Globus, JuiceFS, Cyberduck, and boto3, and staff will be available to help you get started with Kopah S3. Learn more.

  • GPU in EDU - Save the date: Thursday, May 15th, 10:00 a.m. - 4:00 p.m. Join UW-IT Research Computing for a day of learning about GPUs from the experts at NVIDIA and Cambridge Computer. This event will feature live demonstrations for building your GPU workflows, recommendations for incorporating GPUs into instruction, computing research at UW, and more! Registration link coming soon!

Spring Office Hours

If you would like to request 1 on 1 help, please send an email to help@uw.edu with "Hyak Office Hour" in the subject line to coordinate a meeting.

Training Resources

Opportunities

External Training Opportunities

  • Parallel Computing with MATLAB: Hands on workshop Coming Up April 16 1PM PDT - During this hands-on workshop, we will introduce parallel and distributed computing in MATLAB with a focus on speeding up application codes and offloading computers. By working through common scenarios and workflows using hands-on demos, you will gain a detailed understanding of the parallel constructs in MATLAB, their capabilities, and some of the common hurdles that you'll encounter when using them. SIGN UP! Highlights:

    • Multithreading vs multiprocessing
    • When to use parfor vs parfeval constructs
    • Creating data queues for data transfer
    • Leveraging NVIDIA GPUs
    • Parallelizing Simulink models
    • Working with large data
  • AI in Science Postdoc Workshop - Generative AI systems built upon large language models (LLMs) show great promise as tools that enable people to access information through natural conversation. Scientists can benefit from the breakthroughs these systems enable to create advanced tools that will help accelerate their research outcomes. Last week the UW Scientific Software Engineering Center (SSEC) offered a Generative AI / RAG training opportunity to the Schmidt AI in Science Postdoctoral Fellows. The online tutorial focused on how to utilize the underlying methods in Generative AI to advance scientific research, including the basics of LLMs followed by a demo of using LLMs and RAG for creating a question answering tool based on private data. Learn more.

  • Trainings from Texas A&M University - gain skills that are transferrable to Hyak (Linux, Slurm job scheduler, Open OnDemand, NVIDIA GPU).

    • ACES: Python for Data Science 04/11/25 - 10:00 AM - 04:00 PM CDT - This short course for experienced programmers introduces the Numpy, Pandas, and Matplotlib libraries commonly used to manage and display large datasets in Python. Exercises will be performed in the learner's web browser in Jupyter Notebooks running in the Open OnDemand portal of the ACES cluster. More information about this course at https://hprc.tamu.edu/training/aces_python4data.html.
    • ACES: Metagenomics 04/15/25 - 01:30 PM - 04:00 PM CDT - This short course introduces concepts of metagenomic analyses based on data generated by Next Generation Sequencing (NGS) technology using ACES cluster, a composable accelerator testbed at Texas A&M University. Students will learn to complete a metagenomics pipeline using QIIME2 software. Read more at https://hprc.tamu.edu/training/aces_metagenomics.html
    • ACES: Rust 04/22/25 - 10:00 AM - 12:30 PM CDT - This short course provides an introduction to Rust, a modern systems programming language known for its safety, speed, and concurrency, through hands-on coding exercises and practical examples using the ACES cluster, a composable accelerator testbed at Texas A&M University. Learn more at https://hprc.tamu.edu/training/aces_rust.html
    • ACES: GPU Programming (CUDA) 04/22/25 - 01:30 PM - 04:00 PM CDT - This short course covers basic topics in CUDA programming on NVIDIA GPUs. More information about this Short Course https://hprc.tamu.edu/training/intro_cuda.html Topics include:
      • CUDA architecture
      • basic language usage of CUDA C/C++
      • writing, executing CUDA code.
    • ACES: Python for HPC and Advanced Topics 04/25/25 - 10:00 AM - 04:00 PM CDT - This short course for experienced Python programmers will cover several topics relevant to Python workloads running on HPC systems, including environment and data handling. Exercises will be performed in the learner's web browser in Jupyter Notebooks running in the Open OnDemand portal of the ACES cluster. More information about this course at https://hprc.tamu.edu/training/aces_python4HPC.html.
  • Trainings from the National Energy Research Scientific Computing Center (NERSC) - gain skills that are transferrable to Hyak

    • NERSC N-Ways to GPU Programming Bootcamp - NERSC, in collaboration with the OpenACC Organization and NVIDIA, is hosting an N-Ways to GPU Programming Bootcamp for 3 days from Wednesday, April 30 to Friday, May 2. Beginner users in GPU programming are especially encouraged to attend. This is a virtual event. For detailed information, and how to apply, please refer to Open Hackathons' Bootcamp Events page. The application deadline is April 13, 2025.
    • Building GPU-Accelerated Differentiable Simulations with NVIDIA Warp Python, 1-4 pm (Pacific time), Wednesday, May 28, 2025 - LANL is hosting a Building GPU-Accelerated Differentiable Simulations with NVIDIA Warp Python training, presented by Eric Shi from NVIDIA. Warp is a Python framework designed for authoring high-performance, GPU-accelerated code directly in Python. At its core, Warp uses a programming model where Python functions are just-in-time (JIT) compiled into efficient code that can run on both CPUs and NVIDIA GPUs, using C++/CUDA as an intermediate representation. This approach lets developers harness GPU performance while maintaining the simplicity and flexibility of Python. Learn more and register.
    • NERSC GPU Hackathon - NERSC, in conjunction with NVIDIA and the OpenACC organization, will be hosting an Open Hackathon from July 16th-18th with an opening day on July 9th as part of the annual Open Hackathon Series. The NERSC Open Hackathon will be hosted as a hybrid event. Days 0-1 (July 2 and 9) will be held virtually, and attendees will have the option to attend Days 2-4 (July 16 - 18) either in-person at NERSC or online. This hackathon is open to everyone looking to take their projects to the next level; however, priority acceptance will be given to NERSC collaborators. Please note the deadline to submit a proposal is 11:59 PM Pacific, May 28, 2025. So apply now! Please note acceptance is not confirmed until you have received a confirmation email. Learn more.
    • More opportunities from NERSC

Jobs

  • The UW Echospace research group is seeking a student or an early career professional to help develop open-source scientific software for processing large-scale ocean sonar data to accelerate marine ecosystem research. APPLY!

  • Metascience & AI Postdoctoral Fellowship - The Sloan Foundation is awarding grants of up to $250k to support early career researchers in the social sciences and humanities who are interested in understanding the implications of AI for the science and research ecosystem. Applicants are required to have a faculty mentor based at the research organization where the grant is to be held. Feel free to reach out to eScience for support with identifying a potential UW mentor. Applications due April 10, 2025! Learn more.

  • The Center for Geospatial Intelligence at George Mason University is looking for a Postdoctoral Research Fellow in geospatial data science and machine learning to work on research projects related to large-scale urban simulations and geospatial machine learning. APPLY!

  • Yale Center for Research Computing is hiring an HPC System Administrator to join the center’s Engineering team to provide hardware and software administration for a growing number of high performance computing (HPC) clusters used in faculty research. This position will support the infrastructure behind all of the above, including hardware, system and resource-management software, networking, storage, monitoring and security measures. This is a highly-collaborative effort, so frequent interaction with other system administrators, research support staff, management, vendors and researchers is a regular part of the role. Learn more.

  • Job opening for a High Performance Computing (HPC) Research Computing Associate at Colby College in Maine. Learn more.

If you have any questions about using Hyak, please start a help request by emailing help@uw.edu with "Hyak" in the subject line.

Happy Computing,

Hyak Team

March 2025 Maintenance Details

Kristen Finch

HPC Staff Scientist

March maintenance is complete, and Klone is back in operation. Notable updates from this maintenance include new features and documentation (VS Code via Open OnDemand), upcoming events (SAVE THE DATE for Hackathon and GPU Day), newly-released training videos (Advanced Slurm), office hours, and opportunities from the national computing community! The next maintenance is scheduled for Tuesday April 8, 2025 (AKA the 2nd Tuesday of the month).

Notable Updates

  • Operating system - The login and compute node images were updated to address system patches and security updates in the Linux operating system (OS).
  • Hardware - The HPC engineering team used this scheduled time to perform what would have otherwise been disruptive tasks by physically re-arranging equipment within the data center in preparation for a future (GPU) cluster launch. A new rack was also prepared for a bulk expansion of HPC compute node capacity.

New Features and Documentation

  • Our GPU documentation section has an additional worked example for preparing a software container to utilize Hyak GPUs. Check out the new content.
  • VS Code has been added as an interactive application via Hyak’s Open OnDemand Platform. Go to Open OnDemand (VPN required if off campus) and view the supporting documentation.
  • Have you ever wanted to contribute documentation or tutorials to support your fellow Hyak users? By preparing a GitHub CodeSpace and containerized development environment, we have made contributing to Hyak’s documentation website and repository more accessible. We invite and appreciate your feedback and contributions, no matter how small. Check out our guide for using the development container and submitting your contributions via a Pull Request.

Upcoming Training and Events

  • The Research Computing Club (RCC) is hosting a Hackathon on Saturday April 19 10am-5pm. The RCC officers have prepared a set of modules for participants to work through together in small groups. Lunch will be provided. Registration is required. Additional details coming soon.
  • GPU Day Thursday May 15 SAVE THE DATE! hosted by UW-IT Research Computing featuring hands-on demonstrations from Cambridge Computer and NVIDIA. Time and venue TBA. Stay tuned for more details and links to register.

New training videos uploaded

Hyak: Advanced Slurm recording from training on February 25. This tutorial demonstrates the benefits of Slurm and provides you with some template Slurm scripts that you can adapt for your purposes. Click here for the walk-through tutorial and training materials.

Check out our Research Computing Training Playlist on UW-IT's YouTube channel.

Winter 2025 Office Hours

If you would like to request 1 on 1 help, please send an email to help@uw.edu with "Hyak Office Hour" in the subject line to coordinate a meeting.

Opportunities

  • We're Hiring! The Hyak team is hiring an HPC Staff Scientist to join our team. Use your experience to help Hyak users build capacity, especially as we launch our new GPU system this fall. Job description HERE.

  • Join the UW AI Community of Practice on MS Teams to get updates from UW-IT's AI team about events and join the discussion around AI in the news, society, and culture.

  • The Accelerated AI Algorithms for Data-Driven Discovery (A3D3) Institute funded by the National Science Foundation (NSF), under the Harnessing the Data Revolution (HDR) program, is seeking postbaccalaureate research fellows to join our interdisciplinary teams of scientists and engineers to develop and deploy artificial intelligence to accelerate science discoveries in particle physics, astrophysics, biology, and neuroscience. APPLY!

  • The Alfred P. Sloan Foundation is providing a postdoctoral fellowship program to support early career researchers in the social sciences and humanities who are interested in building a career in understanding the implications of AI for the science and research ecosystem. APPLY!

  • eScience is seeking current and incoming UW postdocs who are actively involved in developing or utilizing advanced data science/AI tools and techniques in their research at the UW. Applications are due Saturday, March 15th. APPLY!

  • Tania Malik, a lecturer at the School of Informatics and Cybersecurity, and a director of HPC Nexus lab at TU Dublin, Ireland is looking to support one postdoc under the SUSTAIN-FIT Horizon Europe Programme. If you are passionate about Energy-Efficient High-Performance Computing (HPC) or Green Computing for modern HPC platforms feel free to contact tania.malik@tudublin.ie with your CV and a brief research proposal.

If you have any questions about using Hyak, please start a help request by emailing help@uw.edu with "Hyak" in the subject line.

Happy Computing,

Hyak Team

February 2025 Maintenance Details

Kristen Finch

HPC Staff Scientist

Our February maintenance is complete, and Klone is back in operation. The next maintenance is scheduled for Tuesday March 11, 2025 (AKA the 2nd Tuesday of the month). Below we discuss notable updates from this maintenance session, upcoming user training, newly-released training videos, office hours, and opportunities from the national computing community!

Notable Updates

  • Security updates for firmware on all nodes.
  • Software updates for security on the operating system.
  • Increased resiliency of the job scheduler.

Upcoming Training

Hyak: Advanced Slurm - This workshop will be held in person on Wednesday February 26 10am - 11:45am in the eScience Classroom (Address: WRF Data Science Studio, UW Physics/Astronomy Tower, 6th Floor; 3910 15th Ave NE, Seattle, WA 98195). For this tutorial, we'll use this container and a publicly available dataset to practice submitting single and array jobs to Hyak's job scheduler, Slurm. Locator is a set of Python tools that builds a neural network with TensorFlow to estimate the location of organisms based on their genotype (DNA; or genetic background). The neural network is trained on genotypes from a set of organisms with known location. Performance is assessed with “unknown origin” genotypes by calculating the distance between their predicted and true location. Click here to learn more and register for this event.

New training videos uploaded

Hyak: Intro to Deep Learning recording from training on December 3. Bernease Herman from the eScience Institute introduces deep learning concepts including data representations, data importance, learning tasks, deep learning frameworks, and computing environments. This training includes a hands-on demonstration of fully connected neural networks and convolutional neural networks with MNIST digit data and a brief discussion about deep learning frameworks on Hyak.

Hyak: Open OnDemand recording from training on January 31. Open OnDemand (OOD) is an open-source web portal for HPC centers to provide users with an easy-to-use web interface to HPC clusters. For the last year, the Hyak team has been adding features to OOD. This workshop demonstrates OOD's main features such as exploring the filesystem, composing jobs, and launching interactive applications like Jupyter, Rstudio, MATLAB, and Virtual Desktop. Training presented by UW Research Computing Intern, Bhavik Soni.

Check out our Research Computing Training Playlist on UW-IT's YouTube channel.

Winter 2025 Office Hours

If you would like to request 1 on 1 help, please send an email to help@uw.edu with "Hyak Office Hour" in the subject line to coordinate a meeting.

Opportunities

  • COMPLECS: Linux Tools for Text Processing - Many computational and data processing workloads require pre-processing of input files to get the data into a format that is compatible with the user’s application and/or post-processing of output files to extract key results for further analysis. While these operations could be done by hand, they tend to be time-consuming, tedious and, worst of all, error prone. In this session we cover the Linux tools awk, sed, grep, sort, head, tail, cut, paste, cat and split, which will help users to easily automate repetitive tasks. We conclude by showing how large language models (LLMs) such as ChatGPT could be used to write commands using these tools. Register for this remote training.

  • The Trusted CI Student Program offers undergraduate and graduate students invaluable hands-on experience, mentorship, and opportunities to contribute to a secure and resilient digital future. As cybersecurity becomes an essential component of the scientific community, this program empowers students to gain the skills and knowledge needed to thrive in the field. The Trusted CI Student Program offers a unique opportunity for students eager to advance in the cybersecurity field. Applications for the 2025 cohort open on February 3, 2025. Click here for more details on how to apply and program information.

  • Google Cloud Skills Boost seats are available to the UW community. This platform offers online courses, labs, and certifications to help learners develop cloud skills. If you are interested in using this excellent learning resource, please request access via email to Rob Fatland at rob5@uw.edu.

  • Join the Bridge to Ocean Acoustics and Technology (BOAT) workshop and learn ocean acoustics from theory to data practice with Jupyter notebooks! The workshop is organized as part of the BOAT program which aims to grow the ocean acoustics education and research community through open-source curriculum and is funded by the ONR. Application deadline Thursday, February 20th. Apply HERE!

  • The Nawaz Lab at the UW-Applied Physics Laboratory is currently looking for a student to help apply machine learning to long term sensor data with the goal of correcting for drift. Contact Anuscheh Nawaz at anuscheh@uw.edu for more details.

  • The Microsoft AI for Good Open Call supports individuals and organizations creating positive change in Washington State, providing opportunities for scientific collaboration with the Microsoft AI for Good Lab. Application deadline: Monday, February 17th. APPLY

  • Text Mining Student Assistant - This student position, jointly supported by the Open Scholarship Commons and eScience, will provide technical expertise on text mining and natural language processing (NLP) to UW researchers at all levels. Apply NOW

If you have any questions about using Hyak, please start a help request by emailing help@uw.edu with "Hyak" in the subject line.

Happy Computing,

Hyak Team

JuiceFS or using Kopah on Klone

Nam Pho

Research Computing

If you haven't heard, we recently launched an on-campus S3-compatible object storage service called Kopah [docs] that is available to the research community at the University of Washington. Kopah is built on top of Ceph and is designed to be a low-cost, high-performance storage solution for data-intensive research.

warning

This is a proof-of-concept demonstration and not a production-ready or officially endorsed solution, which is why we have not put a more formal walk through in our documentation.

While the deployment of Kopah was welcome news to those who are comfortable working with S3-compatible cloud solutions, we recognize some folks may be hesitant to give up their familiarity with POSIX file systems. If that sounds like you, we explored the use of JuiceFS, a distributed file system that provides a POSIX interface on top of object storage, as a potential solution.

info

Put simply, object storage is typically accessed with a pair of API keys (an access key and a secret key) through a command-line tool that wraps API calls, whereas a POSIX file system is what you typically see when you work with storage on a cluster from the command line.

Installation

JuiceFS isn't installed by default so you will need to compile it yourself or download the pre-compiled binary from their release page.

As of January 2025, the latest version is 1.2.3, and you want the amd64 build if you are using it on Klone. The command below will download the release archive and extract the binary into your current working directory.

wget https://github.com/juicedata/juicefs/releases/download/v1.2.3/juicefs-1.2.3-linux-amd64.tar.gz -O - | tar xzvf -
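If you expect to repeat this for a later release, the URL can be assembled from the version number. The helper below is only a sketch: the function name juicefs_release_url is made up for illustration, and it assumes future releases keep the same file naming pattern as the 1.2.3 link above.

```shell
# Build the download URL for a JuiceFS release tarball.
# Usage: juicefs_release_url VERSION [ARCH]   (ARCH defaults to amd64)
# Assumption: releases keep the naming pattern of the v1.2.3 link above.
juicefs_release_url() {
    printf 'https://github.com/juicedata/juicefs/releases/download/v%s/juicefs-%s-linux-%s.tar.gz\n' \
        "$1" "$1" "${2:-amd64}"
}

# Example (commented out so that sourcing this snippet downloads nothing):
# wget "$(juicefs_release_url 1.2.3)" -O - | tar xzvf -
```

Calling juicefs_release_url 1.2.3 prints the same URL used in the wget command above.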

I move it to a folder in my $PATH so I can run it from anywhere by just calling the binary. Your personal environment may differ here.

mv -v juicefs ~/bin/
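Whether ~/bin is already on your $PATH depends on your shell setup. A small check along these lines can tell you; path_contains is a hypothetical helper name, not something provided on Klone.

```shell
# Return success if DIR is one of the colon-separated entries of PATH_STRING.
# Usage: path_contains DIR [PATH_STRING]   (PATH_STRING defaults to $PATH)
path_contains() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: add ~/bin for the current session only if it is missing.
# path_contains "$HOME/bin" || export PATH="$HOME/bin:$PATH"
```

To make the change permanent you would put the export line in your shell startup file, which varies by shell.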

Verify you can run JuiceFS.

npho@klone-login03:~ $ juicefs --version
juicefs version 1.2.3+2025-01-22.4f2aba8
npho@klone-login03:~ $

Cool, now we can start using JuiceFS!

Standalone Mode

There are two ways to run JuiceFS: standalone or distributed mode. This blog post explores the former. Standalone mode is meant only to present Kopah via POSIX on Klone. The key points are:

  1. An active juicefs process must be running for as long as you want to access the file system.
  2. The file system is intended to be used only on the node where that process is running.

If you want to run JuiceFS on multiple nodes or with multiple users, we plan to share another proof-of-concept covering distributed mode in the future.

Create Filesystem

JuiceFS separates the data (placed into S3 object storage) and the metadata, which is kept locally in a database. The command below will create the myjfs filesystem and store the metadata in a SQLite database called myjfs.db in the directory where the command is run. It puts the data itself into a Kopah bucket called npho-project.

juicefs format \
--storage s3 \
--bucket https://s3.kopah.uw.edu/npho-project \
--access-key REDACTED \
--secret-key REDACTED \
sqlite3://myjfs.db myjfs

You can rename the metadata file and the filesystem name to whatever you want (they don't have to match). The same goes for the bucket name on Kopah. However, I would strongly recommend having unique metadata file names that match the file system names for ease of tracking alongside the bucket name itself.

npho@klone-login03:~ $ juicefs format \
> --storage s3 \
> --bucket https://s3.kopah.uw.edu/npho-project \
> --access-key REDACTED \
> --secret-key REDACTED \
> sqlite3://myjfs.db myjfs
2025/01/31 11:52:47.940709 juicefs[1668088] <INFO>: Meta address: sqlite3://myjfs.db [interface.go:504]
2025/01/31 11:52:47.944930 juicefs[1668088] <INFO>: Data use s3://npho-project/myjfs/ [format.go:484]
2025/01/31 11:52:48.666657 juicefs[1668088] <INFO>: Volume is formatted as {
"Name": "myjfs",
"UUID": "eb47ec30-c1f7-4a92-9b17-23c4beae7f76",
"Storage": "s3",
"Bucket": "https://s3.kopah.uw.edu/npho-project",
"AccessKey": "removed",
"SecretKey": "removed",
"BlockSize": 4096,
"Compression": "none",
"EncryptAlgo": "aes256gcm-rsa",
"KeyEncrypted": true,
"TrashDays": 1,
"MetaVersion": 1,
"MinClientVersion": "1.1.0-A",
"DirStats": true,
"EnableACL": false
} [format.go:521]
npho@klone-login03:~ $

You can verify there is now a myjfs.db file in your current working directory. It's a SQLite database file that will store your file system metadata.
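One quick way to confirm the file really is a SQLite database is to look at its first bytes, since SQLite 3 files begin with the string "SQLite format 3". The function name is_sqlite_db below is illustrative only.

```shell
# Return success if FILE exists and starts with the SQLite 3 header string.
# Usage: is_sqlite_db FILE
is_sqlite_db() {
    [ -f "$1" ] || return 1
    [ "$(head -c 15 "$1" 2>/dev/null)" = "SQLite format 3" ]
}

# Example:
# is_sqlite_db myjfs.db && echo "myjfs.db looks like a SQLite database"
```

This only inspects the header; it does not validate the contents of the database.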

We can also verify the npho-project bucket was created on Kopah to store the data itself.

npho@klone-login03:~ $ s3cmd -c ~/.s3cfg-default ls
2025-01-31 19:48 s3://npho-project
npho@klone-login03:~ $

You should run juicefs format --help to view the full range of options and customize the parameters of your file system to your unique needs but just briefly:

  • Encryption: The format output above reports aes256gcm-rsa (AES-256) as the encryption algorithm, but that is only the default algorithm setting. According to the JuiceFS documentation, data is encrypted at rest only when you provide an RSA private key with the --encrypt-rsa-key flag at format time. You can override the algorithm using the --encrypt-algo flag if you prefer chacha20-rsa.
  • Compression: This is not enabled by default, and enabling it carries a computational penalty whenever you access your files, since the data needs to be decompressed or recompressed on the fly.
  • Quota: By default there is no block (set with --capacity in GiB units) or inode (set with --inodes files) quota enforced at the file system level. If you do not explicitly set this, it will be matched to whatever you get from Kopah. This is still useful for setting explicitly if you wanted to have multiple projects or file systems in JuiceFS that use the same Kopah account and have some level of separation.
  • Trash: By default, files are not deleted immediately but moved to a trash folder similar to most desktop systems. This is set with the --trash-days flag and you can set it to 0 if you want files to be deleted immediately. The default here is 1 day after which the file is permanently deleted.
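To see how the quota and trash options above fit into one command, here is a sketch that prints a juicefs format command line rather than running it. The helper name print_format_cmd and the example values are illustrative assumptions; it follows the earlier suggestion of naming the metadata file after the file system, and you would still add the --access-key and --secret-key options shown earlier before running the result.

```shell
# Print (do not run) a juicefs format command for file system NAME.
# Usage: print_format_cmd NAME BUCKET_URL CAPACITY_GIB INODES TRASH_DAYS
# The metadata file is named NAME.db so it matches the file system name.
# The access and secret keys are deliberately left out of the printed command.
print_format_cmd() {
    printf 'juicefs format --storage s3 --bucket %s --capacity %s --inodes %s --trash-days %s sqlite3://%s.db %s\n' \
        "$2" "$3" "$4" "$5" "$1" "$1"
}

# Example: a 100 GiB, one-million-file quota with a 7 day trash window.
# print_format_cmd myjfs https://s3.kopah.uw.edu/npho-project 100 1000000 7
```

Printing the command first makes it easy to review the values before formatting anything.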

Mount Filesystem

Running the command below will mount your newly created file system to the myproject folder in your home directory. The folder does not need to exist beforehand.

juicefs mount sqlite3://myjfs.db ~/myproject --background

warning

The SQLite database file is critical; do not lose it. You can move it to a different location afterwards, but it contains all the metadata about your files.

This process occurs in the background.

warning

Where you mount your file system the first time is where it will be expected to be mounted going forward.

npho@klone-login03:~ $ juicefs mount sqlite3://myjfs.db ~/myproject --background
2025/01/31 11:57:01.652279 juicefs[1690855] <INFO>: Meta address: sqlite3://myjfs.db [interface.go:504]
2025/01/31 11:57:01.654920 juicefs[1690855] <INFO>: Data use s3://npho-project/myjfs/ [mount.go:629]
2025/01/31 11:57:02.156898 juicefs[1690855] <INFO>: OK, myjfs is ready at /mmfs1/home/npho/myproject [mount_unix.go:200]
npho@klone-login03:~ $
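Because losing the metadata database means losing track of your files, it may be worth keeping dated copies of it. The sketch below is one simple approach; backup_meta is a hypothetical helper, and it assumes you take the copy while the file system is unmounted, since copying a SQLite file that is being written to can produce an inconsistent copy. JuiceFS also ships a juicefs dump command for exporting metadata, which may suit you better.

```shell
# Copy a metadata database into BACKUP_DIR under a timestamped name and
# print the path of the copy.
# Usage: backup_meta DB_FILE BACKUP_DIR
# Note: the variable dest is not function-local in plain POSIX sh.
backup_meta() {
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    mkdir -p "$2" || return 1
    dest="$2/$(basename "$1").$(date +%Y%m%d-%H%M%S)"
    cp -p "$1" "$dest" && printf '%s\n' "$dest"
}

# Example:
# backup_meta myjfs.db ~/juicefs-meta-backups
```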

Use Filesystem

Now with the file system mounted (at ~/myproject) you can use it like any other POSIX file system.

npho@klone-login03:~ $ cp -v LICENSE myproject
'LICENSE' -> 'myproject/LICENSE'
npho@klone-login03:~ $ ls myproject
LICENSE
npho@klone-login03:~ $

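Remember, you won't be able to see the file under its own name in the bucket: JuiceFS splits file data into blocks before storing them there (and encrypts them if you enabled encryption), while file names are kept in the metadata database.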

Recover Deleted Files

If you enabled the trash can option then you can recover files up until the permanent delete date.

First delete a file on the file system.

npho@klone-login03:~ $ cd myproject
npho@klone-login03:myproject $ rm -v LICENSE
removed 'LICENSE'
npho@klone-login03:myproject $

Verify the file is deleted, then recover it from the trash.

npho@klone-login03:myproject $ ls
npho@klone-login03:myproject $ ls -alh
total 23K
drwxrwxrwx 2 root root 4.0K Jan 31 12:54 .
drwx------ 48 npho all 8.0K Jan 31 13:08 ..
-r-------- 1 npho all 0 Jan 31 11:57 .accesslog
-r-------- 1 npho all 2.6K Jan 31 11:57 .config
-r--r--r-- 1 npho all 0 Jan 31 11:57 .stats
dr-xr-xr-x 2 root root 0 Jan 31 11:57 .trash
npho@klone-login03:myproject $ ls .trash
2025-01-31-20
npho@klone-login03:myproject $ ls .trash/2025-01-31-20
1-2-LICENSE
npho@klone-login03:myproject $ cp -v .trash/2025-01-31-20/1-2-LICENSE LICENSE
'.trash/2025-01-31-20/1-2-LICENSE' -> 'LICENSE'
npho@klone-login03:myproject $ ls
LICENSE
npho@klone-login03:myproject $

As you can see, we can recover files that are tracked by their delete date. You would need to copy the file back out to recover it.
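The copy step can be wrapped in a small helper that recovers the original file name for you. This is a sketch with made-up function names, and it assumes trash entries are always named with two numeric fields followed by the original name, as 1-2-LICENSE is in the listing above.

```shell
# Strip the two leading numeric fields from a trash entry name.
# Assumption: entries look like <number>-<number>-<original name>.
trash_original_name() {
    printf '%s\n' "$1" | sed 's/^[0-9][0-9]*-[0-9][0-9]*-//'
}

# Copy a trash entry into DEST_DIR under its original name.
# Usage: restore_from_trash TRASH_ENTRY_PATH DEST_DIR
# Note: the variable name is not function-local in plain POSIX sh.
restore_from_trash() {
    name=$(trash_original_name "$(basename "$1")")
    cp "$1" "$2/$name"
}

# Example, run from inside the mounted file system:
# restore_from_trash .trash/2025-01-31-20/1-2-LICENSE .
```

Only the first two numeric fields are removed, so original names that contain hyphens or digits are left intact.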

Unmount Filesystem

When you are done using the file system you can unmount it with the command below.

npho@klone-login03:~ $ juicefs umount myproject
npho@klone-login03:~ $

Remember, the file system is only accessible in standalone mode so long as a juicefs process is running. Since we ran it in the background you will need to explicitly unmount it.
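If you are not sure whether the file system is still mounted, you can check the kernel's mount table before unmounting. The helper below is a sketch: is_mounted is an illustrative name, it expects the absolute path exactly as it appears in /proc/mounts (for example /mmfs1/home/npho/myproject rather than ~/myproject), and it does not handle paths containing spaces, which /proc/mounts escapes.

```shell
# Return success if DIR appears as a mount point in a mounts table.
# Usage: is_mounted DIR [MOUNTS_FILE]   (MOUNTS_FILE defaults to /proc/mounts)
is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Example: unmount only if the mount is still present.
# is_mounted /mmfs1/home/npho/myproject && juicefs umount /mmfs1/home/npho/myproject
```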

Questions?

Hopefully you found this proof-of-concept useful. If you have any questions for us, please reach out to the team by emailing help@uw.edu with Hyak somewhere in the subject or body. Thanks!

January 2025 Maintenance Details

Kristen Finch

HPC Staff Scientist

Our January maintenance is complete, and Klone is back in operation. The next maintenance is scheduled for Tuesday February 11, 2025 (AKA the 2nd Tuesday of the month).

Notable Updates

  • Slurm update: We updated to version 24.11.0. If any user software was built against the older Slurm libraries (e.g., openmpi), then users may experience errors, and it may be necessary to rebuild their software against the newer Slurm libraries.
  • Compute node updates: The compute node OS images were updated to address any security patches in underlying core packages.
  • Mathematica module: A Mathematica module was installed using the license maintained by the UW Physics Department. Learn how to launch Mathematica on Hyak.
  • MATLAB: The MATLAB application has been added to Hyak Open OnDemand Beta. Learn how to launch MATLAB with Open OnDemand.

Upcoming Training#

Hyak: Open OnDemand – Open OnDemand (OOD) is an open-source web portal for HPC centers to provide users with an easy-to-use web interface to HPC clusters. For the last year, the Hyak team has been adding features to OOD. This workshop will demonstrate OOD's main features such as exploring the filesystem, composing jobs, and launching interactive applications like Jupyter. This workshop will be held in person 10am - 11:30am on Friday January 31, 2025 in CSE2 (Gates Center) Room 371 (3800 E Stevens Way NE, Seattle, WA 98195). Click here to learn more and register for this event.

Winter 2025 Office Hours#

If you would like to request 1 on 1 help, please send an email to help@uw.edu with "Hyak Office Hour" in the subject line to coordinate a meeting.

Opportunities#

The eScience Institute offers the annual Winter School to students and lecturers interested in developing basic skills and knowledge of the tools used in data science. Gaining literacy in topics such as Python, R, Jupyter, and reproducible environments can be beneficial beyond STEM, including areas like global or public health, public policy, social sciences, social work, international relations, and business management. Apply by January 24, 2025. Learn more!

Summer Internship Opportunity at Purdue - The Rosen Center for Advanced Computing (RCAC) is seeking students for Research Experience for Undergraduates (REU) paid internships for an 11-week onsite summer REU program. This program aims at developing the next generation workforce in advanced computing and cyberinfrastructure technologies. It offers students from diverse backgrounds the opportunity to gain the knowledge and skills necessary to build and support advanced research computing systems and scientific applications. As part of RCAC's decade long successful student apprentice program, the REU students will learn by doing, working on the National Science Foundation funded Anvil system in a team environment and mentored by cyberinfrastructure professionals. Open to undergraduate students from all backgrounds and undergraduate programs within and beyond Purdue. Each student will present their work to Purdue staff, faculty, students, collaborators, and researchers at the end of the program. Students may present at a national conference as part of the program. This onsite program at Purdue University in West Lafayette, Indiana runs from May through August. Learn more and apply NOW! Interviews start in January!

AI Unlocked: Empowering Higher Education through Research and Discovery - This workshop, held April 2-3, 2025, is designed for individuals across all disciplines and career stages in higher education. It provides access to cutting-edge computational resources and expertise to researchers, students, faculty, and practitioners, especially those from Minority-Serving Institutions (MSIs). Hosted by the National Artificial Intelligence Research Resource (NAIRR) Pilot User Experience Working Group, this event introduces AI fundamentals, hands-on experience with pre-configured AI tasks, and guidance for tailoring AI models to your projects. Travel support to the Westin Denver Downtown is available for selected participants, and virtual attendance options are also offered. Apply now to advance your AI skills and collaboration in higher education. Apply by January 31, 2025. Applicants will be notified of acceptance by the end of February.

The University of Alaska Fairbanks (UAF) Alaska Center for Energy and Power (ACEP) summer internship is a 10-week program for students to gain hands-on research experience and skill development in the energy industry. Our program offers two internship strands: AUSI and REU. Regardless of strand, all interns will receive:

  • A specific research project with 1:1 mentorship from an ACEP researcher
  • Collaborative workspace at ACEP
  • Travel to and from Alaska
  • Field trips related to energy in Alaska

Applications due January 24, 2025.

If you have any questions about using Hyak, please start a help request by emailing help@uw.edu with "Hyak" in the subject line.

Happy Computing,

Hyak Team

December 2024 Maintenance Details

Kristen Finch

HPC Staff Scientist

Our December maintenance is complete. Thank you for your patience. The next maintenance will be Tuesday January 14, 2025.

Notable Updates#

  • Updated node images, ensuring the security and behavior you expect from Hyak klone.
  • We migrated LMOD modules and other tools and software stored in /mmfs1/sw/ to Solid State Drives (SSDs). Users may notice increased speed of modules.

Office Hours over Winter Break#

I will continue to hold Zoom office hours on Wednesdays at 2pm throughout December. Attendees need only register once and can attend any of the occurrences with the Zoom link that will arrive via email. Click here to Register for Zoom Office Hours.

December 12 will be the last in-person office hours for Fall term to be held at 2pm at the eScience Institute (address: WRF Data Science Studio, UW Physics/Astronomy Tower, 6th Floor, 3910 15th Ave NE, Seattle, WA 98195). In-person office hours will resume on Thursdays at 2pm starting January 9, 2025.

If you would like to request 1 on 1 help, please send an email to help@uw.edu with "Hyak Office Hour" in the subject line to coordinate a meeting.

Research Computing Club Officer Nominations#

The Research Computing Club (RCC) at UW is looking for nominations for Officer positions. The RCC provides essential computational resources to support students working on a wide range of research projects using high-performance computing through UW’s Hyak system and cloud platforms like AWS, Microsoft Azure, and Google Cloud. The RCC relies on student officers to continue providing these resources to the UW community and to organize community-driven events such as Hackathons and trainings. Officer positions are:

  • President
  • Director of AI/ML
  • Director of Outreach
  • Director of Hyak
  • Director of Cloud Computing

Please consider nominating yourself if you are interested, or nominating someone you know who is interested, by filling out the form linked HERE.

External Opportunities#

DOE Computational Science Graduate Fellowship - Applications are being accepted through January 16, 2025 for the Department of Energy Computational Science Graduate Fellowship (DOE CSGF). Candidates must be U.S. citizens or lawful permanent residents who plan full-time, uninterrupted study toward a Ph.D. at an accredited U.S. university.

The DOE CSGF is composed of two computational science tracks. Eligible fellowship candidates should carefully review the criteria for both tracks prior to initiating the application process:

  • The Science & Engineering Track accepts doctoral students engaged in computational science research with a science or engineering focus.
  • The Mathematics/Computer Science Track accepts students pursuing research in broadly applicable methods and technology for high-performance computing (HPC) systems.

Students applying to the Mathematics/Computer Science Track must be pursuing a doctoral degree in applied mathematics, statistics, computer science, computer engineering or computational science — in one of these departments or their academic equivalent. A departmental exception is made for students whose research is focused on algorithms or software for quantum information systems and who are enrolled in a science or engineering field. In all cases, research must contribute to more effective use of emerging HPC systems.

DOE NNSA Stewardship Science Graduate Fellowship - Applications are being accepted through January 14, 2025 for the Department of Energy National Nuclear Security Administration Stewardship Science Graduate Fellowship (DOE NNSA SSGF). Candidates must be U.S. citizens who plan full-time, uninterrupted study toward a Ph.D. at an accredited U.S. university.

The DOE NNSA SSGF provides doctoral students with in-depth training in areas relevant to stewardship of the nation's nuclear stockpile: high energy density physics, nuclear science, or materials under extreme conditions. Senior undergraduates and first- or second-year doctoral students are eligible to apply.

DOE NNSA Laboratory Residency Graduate Fellowship - As part of its science and national security missions, the U.S. Department of Energy National Nuclear Security Administration (DOE NNSA) supports a spectrum of basic and applied research in science and engineering at the agency's national laboratories, at universities and in industry.

Because of its continuing needs, the NNSA seeks candidates who demonstrate the skills and potential to form the next generation of leaders in the following fields via the DOE NNSA LRGF program:

  • Engineering and Applied Sciences
  • Physics
  • Materials
  • Mathematics and Computational Science

To meet its primary objective of encouraging the training of scientists, the DOE NNSA LRGF program provides financial support to talented students who accompany their study and research with practical work experience at one or more of the following DOE NNSA facilities: Lawrence Livermore National Laboratory, Los Alamos National Laboratory, Sandia National Laboratories or the Nevada National Security Site.

NSF ACCESS Student Training and Engagement Program (STEP) - Applications due February 15.

  • STEP-1: 2-week introductory experience - Join a dynamic two-week experience in Miami, FL in May! Travel, accommodations, and stipend included. This program is designed to help you:

    • Become more aware of the many areas of interest within the field: web programming, high performance computing, cybersecurity, networking, real use of AI.
    • Determine where your interests are.
    • Discover what areas of study you should pursue in the future.
    • Experience high levels of direct interaction with diverse people with diverse skill sets.
  • STEP-2: Full-time for the summer (in-person + virtual), follows STEP-1. Travel and accommodations are covered for in-person events, and participants receive a stipend plus travel to a national conference. This program is designed to help you:

    • Develop an intermediate understanding of one of the areas of interest within the field: web programming, high performance computing, cybersecurity, networking, real use of AI.
    • Provide opportunities for interactions with professionals in the field.
  • STEP-3: Part-time during the school year, follows STEP-2. Continue interacting (virtually) with your team on projects on a part-time basis during the academic year. Participants will receive a stipend and travel to the annual Supercomputing conference (SC25 in St. Louis, MO, USA). This program is designed to help you:

    • Develop a deep understanding of one of the areas of interest within the field: web programming, high performance computing, cybersecurity, networking, real use of AI.
    • Provide further opportunities for interactions with professionals in the field.

Understand how a career in cyberinfrastructure could fit with your future plans!

External Trainings#

Looking for FREE training on cutting-edge digital technologies for industry? Find out more about the Training from the Hartree Centre HERE!

Whether you’re looking to get to grips with the basics or searching for new tools and techniques to apply, Hartree Training supports both self-directed online learning as well as face-to-face practical sessions with badge certification available for you to share your new skills with your network. Currently offering courses covering a range of advanced digital technologies including:

  • Data Science
  • Artificial Intelligence and Modelling
  • High Performance and Exascale Computing
  • Software Engineering
  • Emerging Technologies

To enroll, create a FREE account or log in. Once you have created an account, you can log in and sign up for as many of our courses as you like.

NVIDIA Professional Workshops and Certificates - The blog post HERE discusses NVIDIA Professional Workshops ($3000 each) and Certificate Exams ($400 each) in AI Infrastructure and AI Operations. It includes links to workshops and certificate programs in Generative AI LLMs and Generative AI Multimodal.

Happy Computing,

Hyak Team

November 2024 Maintenance Details

Kristen Finch

HPC Staff Scientist

Our November maintenance is complete. Thank you for your patience while we make package updates to node images, ensuring the security and behavior you expect from Hyak klone.

The next maintenance will be Tuesday December 10, 2024.

Notable Updates#

  • We updated the GPU drivers to version 565.57.01 to patch security vulnerabilities CVE-2024-0117 through CVE-2024-0121 and CVE-2024-0126.
  • We migrated user Home directories to Solid State Drives (SSDs). Users may notice increased speed of software stored in Home directories.

Upcoming Training#

Hyak: Scheduling jobs with Slurm workshop is on Thursday, November 14th, from 10 a.m. to 12 p.m. in the WRF Data Science Studio (UW Physics/Astronomy Tower, 6th Floor; 3910 15th Ave NE, Seattle, WA 98195), and will cover Hyak’s job scheduler Slurm, interactive jobs, batch jobs, and array jobs. REGISTER HERE!

Hyak: Introduction to Deep Learning workshop is on Tuesday, December 3rd, from 10 to 11:50 a.m. in CSE2 (Gates Center) Room 371. Participants will start with a computer vision example, training a model on a sample dataset, and then learn how to execute the training process on Hyak. REGISTER HERE!

Office Hours#

Zoom office hours will be held on Wednesdays at 2pm. Attendees need only register once and can attend any of the occurrences with the Zoom link that will arrive via email. Click here to Register for Zoom Office Hours

In-person office hours will be held on Thursdays at 2pm at the eScience Institute (address: WRF Data Science Studio, UW Physics/Astronomy Tower, 6th Floor, 3910 15th Ave NE, Seattle, WA 98195).

The Research Computing Club will be holding office hours fall term. In-person office hours will be held at the eScience Institute, WRF Data Science Studio.

Officer | Date | Time
Sam Shin | 19 Nov | 2pm
Teerth Mehta | 3 Dec | 2pm

If you would like to request 1 on 1 help, please send a ticket to help@uw.edu with "Hyak Office Hour" in the subject line to coordinate a meeting.

October 2024 Maintenance Details

Kristen Finch

HPC Staff Scientist

Our October maintenance is complete. Thank you for your patience while we make package updates to node images, ensuring the security and behavior you expect from Hyak klone.

The next maintenance will be Tuesday November 12, 2024.

New Tools Documentation#

Our research computing interns have been hard at work adding documentation for new user tools that might help optimize your research computing. Click the links below to review the docs for these tools.

Squash Fuse - SquashFS packages multiple small files into a single read-only, compressed filesystem, reducing metadata calls and improving performance. This minimizes server load, enhancing throughput and efficiency in handling storage requests.

Use case on Hyak: In HPC, datasets often consist of numerous small files, which can lead to performance bottlenecks due to excessive metadata operations. By utilizing SquashFS, HPC applications can significantly reduce metadata overhead, improving data access speeds and enhancing overall system performance, particularly in large-scale distributed storage systems.
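As a rough illustration of that workflow, the sketch below builds the commands one might use to pack a directory of small files into a SquashFS image, mount it read-only, and unmount it afterward. It assumes the standard `mksquashfs`, `squashfuse`, and `fusermount` tools are on your PATH; the function names and any paths you pass in are hypothetical, and the linked documentation remains the reference for how to do this on Hyak.

```python
import subprocess


def squashfs_commands(dataset_dir, image_path, mount_point):
    """Return the commands to pack, mount, and unmount a dataset image."""
    return {
        # Pack the whole directory into one compressed, read-only image.
        "pack": ["mksquashfs", dataset_dir, image_path],
        # Mount the image in user space (FUSE) at an existing, empty directory.
        "mount": ["squashfuse", image_path, mount_point],
        # Release the FUSE mount when the job is done.
        "unmount": ["fusermount", "-u", mount_point],
    }


def run_step(commands, step):
    """Run one step; raises CalledProcessError if the tool exits non-zero."""
    subprocess.run(commands[step], check=True)
```

Keeping the command construction separate from execution makes it easy to print and review the commands before running them inside a job script.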

Checkpointing with DMTCP - DMTCP is a tool to transparently checkpoint and restart jobs, saving their state to disk so they can be resumed at a later time. It requires no changes to application code, which makes it easy to adopt. Using DMTCP with your code allows checkpointing at regular intervals, so if your job is pre-empted or reaches the time limit, it can resume from its last checkpoint.

Use case on Hyak: DMTCP offers a solution for folks who would like to use Hyak's ckpt partitions, but have jobs that exceed the ckpt time limits of 5 hours for CPU-only jobs and 8 hours for GPU-only jobs. Checkpointing with DMTCP facilitates efficient use of ckpt resources, allowing higher throughput for your jobs.
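To get a feel for how a long job maps onto the ckpt time limits, here is a small back-of-the-envelope sketch. The function name and the one-hour default checkpoint interval are assumptions for illustration, and the estimate is deliberately pessimistic: it assumes each segment loses up to one checkpoint interval of work and ignores restart overhead.

```python
import math

# Time limits quoted above for the ckpt partitions, in hours.
CKPT_LIMIT_HOURS = {"cpu": 5, "gpu": 8}


def segments_needed(total_hours, job_type, checkpoint_interval_hours=1.0):
    """Estimate how many ckpt job segments a run of `total_hours` needs.

    Work done after the last checkpoint in a segment is not saved, so only
    (limit - interval) hours per segment are counted as guaranteed progress.
    """
    limit = CKPT_LIMIT_HOURS[job_type]
    useful = limit - checkpoint_interval_hours
    if useful <= 0:
        raise ValueError("checkpoint interval must be shorter than the time limit")
    return math.ceil(total_hours / useful)
```

Under these assumptions, a 20-hour CPU job checkpointing hourly would need `segments_needed(20, "cpu")` = 5 segments, and the same job on a GPU node would need 3.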

Tools for Kopah Storage Users - We have installed command-line interface tools like s3cmd and s5cmd on klone, and we provide instructions for using the Python library boto3 to interact with Kopah and retrieve data, so you can build Kopah S3 storage usage into your research computing applications on Hyak.
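For orientation, a boto3 workflow against an S3-compatible service generally looks like the sketch below. This is not taken from the Kopah documentation: the helper names are hypothetical, and the endpoint URL, bucket, key, and credentials are placeholders you would replace with the values for your own account.

```python
def s3_client_kwargs(endpoint_url, access_key, secret_key):
    """Keyword arguments for boto3.client pointing at an S3-compatible endpoint."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def download_object(client_kwargs, bucket, key, destination):
    """Fetch one object to local disk, for example before a job reads it."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client(**client_kwargs)
    client.download_file(bucket, key, destination)
```

A call might look like `download_object(s3_client_kwargs("https://s3.example.edu", "KEY", "SECRET"), "my-bucket", "inputs/data.csv", "data.csv")`, where every argument is a placeholder.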

If you have any issue using these tools, please open a ticket by emailing help@uw.edu with "Hyak" in the subject line. We appreciate any feedback about how to improve ease of use for tools presented in our documentation.

Upcoming Training#

We have planned 2 Hyak-specific trainings for this Fall (more to come, stay tuned). These trainings will be held in person and will not be recorded since recorded materials are already publicly accessible. Capacity is limited, so follow the links below to register today and guarantee your spot.

Hyak: Containers are your friend - Monday October 28 10am - 12pm.

Hyak: Scheduling Jobs with Slurm - Thursday November 14 10am - 12pm

In the first hour and a half, we will go over content. The last 30 minutes will be reserved for questions.

Location: eScience Classroom; WRF Data Science Studio, UW Physics/Astronomy Tower, 6th Floor; 3910 15th Ave NE, Seattle, WA 98195

Keep an eye on your inbox for updates about additional trainings this fall.

Fall Office Hours#

Hyak HPC Staff Scientist and Facilitator, Kristen Finch, will be holding office hours fall term.

Zoom office hours will be held on Wednesdays at 2pm. Attendees need only register once and can attend any of the occurrences with the Zoom link that will arrive via email.

Click here to Register for Zoom Office Hours

In-person office hours will be held on Thursdays at 2pm at the eScience Institute (address: WRF Data Science Studio, UW Physics/Astronomy Tower, 6th Floor, 3910 15th Ave NE, Seattle, WA 98195). Click here to RSVP for in-person Office Hours.

Click here to visit the eScience Office Hours page to see additional eScience office hours including AI/ML, R, Earth Data, and Python (not available to help with Homework).

The Research Computing Club will be holding office hours fall term. In-person office hours will be held at the eScience Institute (address: WRF Data Science Studio, UW Physics/Astronomy Tower, 6th Floor, 3910 15th Ave NE, Seattle, WA 98195).

Officer | Date | Time
Brenden Pelkie | 16 Oct | 1pm
Nels Schimek | 23 Oct | 1pm
Nels Schimek | 6 Nov | 1pm
Sam Shin | 19 Nov | 2pm
Teerth Mehta | 3 Dec | 2pm

If you would like to request 1 on 1 help, please send a ticket to help@uw.edu with "Hyak Office Hour" in the subject line to coordinate a meeting with Kristen.

Please don't hesitate to reach out to the Hyak team with issues and feedback by opening a ticket via email to help@uw.edu with "Hyak" in the subject line.

Have a great October!

Happy computing,

Hyak Team

September 2024 Maintenance Details

Kristen Finch

HPC Staff Scientist

Thanks again for your patience with our monthly scheduled maintenance. During this maintenance session, we were able to provide package updates to node images to ensure compliance with the latest operating system level security fixes and performance optimizations. Of note, the NVIDIA GPU driver on Klone has been updated to the latest production datacenter release, version 560.35.03.

The next maintenance will be Tuesday October 8, 2024.

AMD Math libraries#

In June we announced the addition of AMD Nodes and Slices to klone which make up our generation 2 or g2 collection of resources. Click here to read more about the difference between our g1 and g2 resources. During August, we installed a new AMD compiler suite, AOCC, along with specialized math libraries like AOCL, OpenBLAS, and ScaLAPACK as modules to make the most of this upgrade. The new modules are useful on all partitions. The AOCC and AOCL modules are particularly relevant for partitions cpu-g2, cpu-g2-mem2x, and ckpt-g2. These tools are designed to optimize performance on AMD processors, speeding up complex mathematical computations. Whether you're working on simulations, data analysis, or any number-crunching tasks, these libraries may help ensure you get faster, more efficient results. If you're looking to boost your workflow, it's worth exploring how these libraries can benefit your projects. Here are the names of the new modules:

aocc/4.2.0
aocl/4.2.0
openblas/0.3.28
scalapack/2.2.0

In our benchmarking tests, performance of these libraries was similar on g1 and g2 CPUs for each math library, regardless of architecture. The best library performer on AMD CPUs is AOCC+AOCL, and for Intel CPUs it’s OpenBLAS+ScaLAPACK:

Image displays graph indicating that the best library performer on AMD CPUs is AOCC plus AOCL, and for Intel CPUs it’s OpenBLAS plus ScaLAPACK
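One way to apply this result in a job setup script is sketched below. The helper names are hypothetical, and the sketch makes a simplifying assumption that is not stated in this post: any partition other than the three g2 partitions named above is treated as Intel-based.

```python
# AMD (g2) partitions named in this post.
AMD_G2_PARTITIONS = {"cpu-g2", "cpu-g2-mem2x", "ckpt-g2"}


def recommended_modules(partition):
    """Pick the math-library modules that benchmarked best for the CPU type."""
    if partition in AMD_G2_PARTITIONS:
        return ["aocc/4.2.0", "aocl/4.2.0"]
    # Assumption: everything else is an Intel (g1) partition.
    return ["openblas/0.3.28", "scalapack/2.2.0"]


def module_load_command(partition):
    """Build the `module load` line for a job script on the given partition."""
    return "module load " + " ".join(recommended_modules(partition))
```

For example, `module_load_command("cpu-g2")` yields `module load aocc/4.2.0 aocl/4.2.0`, while `module_load_command("ckpt")` yields `module load openblas/0.3.28 scalapack/2.2.0`.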

Fall Office Hours#

Hyak HPC Staff Scientist and Facilitator, Kristen Finch, will be holding office hours fall term. Zoom office hours will be held on Wednesdays at 2pm. Attendees need only register once and can attend any of the occurrences with the Zoom link that will arrive via email.

Click here to Register

In-person office hours will be held on Thursdays at 2pm at the eScience Institute (address: WRF Data Science Studio, UW Physics/Astronomy Tower, 6th Floor, 3910 15th Ave NE, Seattle, WA 98195). Click here to RSVP for in-person Office Hours.

Click here to visit the eScience Office Hours page to see additional eScience office hours including AI/ML, R, Earth Data, and Python (not available to help with Homework).

If you would like to request 1 on 1 help, please send a ticket to help@uw.edu with "Hyak Office Hour" in the subject line to coordinate a meeting with Kristen.

August 2024 Training Videos#

In case you missed it, we recorded the August 2024 Wednesday training sessions and posted them on the UW-IT YouTube channel under the playlist, "Hyak Training." Here are the links:

Keep an eye on your inbox for updates about our Fall training schedule; training sessions are currently TBA. Trainings will be announced via the Hyak mailing list; click here to join the mailing list.

August 2024 Maintenance Details

Kristen Finch

HPC Staff Scientist

Thanks again for your patience with our monthly scheduled maintenance. During this maintenance session, we were able to provide package updates to node images to ensure compliance with the latest operating system level security fixes and performance optimizations.

The next maintenance will be Tuesday September 10, 2024.

New self-hosted S3 storage option: KOPAH#

We are happy to announce the preview launch of our self-hosted S3 storage called KOPAH. S3 storage is a solution for securely storing and managing large amounts of data, whether for personal use or research computing. It works like an online storage locker where you can store files of any size, accessible from anywhere with an internet connection. For researchers and those involved in data-driven studies, it provides a reliable and scalable platform to store, access, and analyze large datasets, supporting high-performance computing tasks and complex workflows.

S3 uses buckets as containers to store data, where each bucket can hold 100,000,000 objects, which are the actual files or data you store. Each object within a bucket is identified by a unique key, making it easy to organize and retrieve your data efficiently. Public links can be generated for KOPAH objects so that users can share buckets and objects with collaborators.
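The bucket, object, and key relationship can be pictured with a toy in-memory model. This is only a conceptual sketch with a hypothetical `Bucket` class; it says nothing about how KOPAH is implemented, and real access goes through S3 tools rather than code like this.

```python
MAX_OBJECTS_PER_BUCKET = 100_000_000  # per-bucket figure quoted above


class Bucket:
    """A named container mapping unique keys to stored objects."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put(self, key, data):
        is_new = key not in self._objects
        if is_new and len(self._objects) >= MAX_OBJECTS_PER_BUCKET:
            raise RuntimeError("bucket is full")
        self._objects[key] = data  # writing to an existing key replaces it

    def get(self, key):
        return self._objects[key]

    def list_keys(self, prefix=""):
        """Keys are flat strings; shared prefixes act like folders."""
        return sorted(k for k in self._objects if k.startswith(prefix))
```

Because keys are plain strings, a naming scheme such as `project/run1/output.csv` is what lets you organize and retrieve related objects together by prefix.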

Click here to learn more about KOPAH S3.

Who should use KOPAH?#

KOPAH is a storage solution for anyone. Just like other storage options out there, you can upload, download, and view your storage bucket with specialized tools and share your data via the internet. For Hyak users, KOPAH provides another storage option for research computing. It is more affordable than /gscratch storage and can be used for active research computing with a few added steps for retrieving stored data prior to a job.

Test Users Wanted#

Prior to September, we are inviting test users to try KOPAH and provide feedback about their experience. If you are interested in becoming a KOPAH test user, please email help@uw.edu with Hyak or KOPAH in the subject line.

Requirements:

  1. While we will not charge for the service until September 1, to sign up as a test user, we require a budget number and worktag. If the service doesn't work for you, you can cancel before September.
  2. We will ask for a name for the account. If your group has an existing account on Hyak (klone /gscratch), it makes sense for the names to match across services.
  3. Please be ready to respond with your feedback about the service.

Opportunities#

PhD students should check out this opportunity for funding from NVIDIA: Graduate Research Fellowship Program

Questions? If you have any questions for us, please reach out to the team by emailing help@uw.edu with Hyak in the subject line.