September 24, 2021 - by Sarah Waldrip
Sarus is new container software developed by CSCS staff specifically for deploying container images in HPC environments. Technically defined as a “container engine”, its core design is consistent with the open industry standard derived from leading non-HPC container engines such as Docker. However, it offers greater flexibility and portability than its predecessor across a wide range of HPC architectures, and it achieves this with near-native performance.
FirecREST is a novel programmatic interface designed for application developers. It was also developed by CSCS staff, and its primary use case since development has been the SELVEDAS project (Services for Large Volume Experiment-Data Analysis utilising Supercomputing and Cloud technologies), a joint undertaking with the Paul Scherrer Institute (PSI) supported by Swissuniversities grants. Although FirecREST is only now entering the market for wider use, it has already caught the attention of HPC centres around the world. CSCS CTO deputy and EuroCC NCC Switzerland project leader Maxime Martinasso first presented the software in late 2019 at the National Energy Research Scientific Computing Center (NERSC) in Berkeley, California, which inspired NERSC to use FirecREST as a reference for their own facilities.
The two products are already in use at CSCS, so we reached out to Martinasso and his deputy project leader, CSCS service and business manager Pablo Fernandez, for more information and insight into the motivation, development, dissemination, and usage of Sarus and FirecREST.
What is the purpose of the new website?
Maxime Martinasso: We recently had the opportunity to be a part of the EuroCC project. In this project, we set the goals for a National Competence Centre in Switzerland, and one of them was to better disseminate the software products we develop. The idea is to “productize” the software, because it has so far only been developed and used in-house at CSCS. We want to make our products more accessible to others, so we must also build out the organization to cover the business, legal and marketing aspects. This new website is part of the marketing outreach effort.
How accessible are these products for the majority of HPC centres?
Pablo Fernandez: Both tools would be suitable for any HPC data centre, and they are equally necessary. I think a key difference, however, is that Sarus is quite standardized and works almost out-of-the-box, whereas FirecREST needs more components to work in the background. These components are typically present in other data centres across the world, but they may differ in some subtle way. For example, the scheduler we use is Slurm, so a centre that uses another scheduler will have to do some extra work to adapt FirecREST.
How important was standardization in the development of these products?
MM: It was in our minds for a long time. When we decided to create Sarus, we were also looking at the Open Container Initiative (OCI), a consortium of many companies whose goal is to provide a standard for all container engines for the Cloud, not for HPC. For context, the standard for containers originates from Docker, the most widely used container engine. In fact, Docker became so important for so many companies that it was decided a standard would be made from it. In the HPC world, the community was not very interested in this standard, but we decided to use it because it is important for vendors like NVIDIA, HPE, AMD and so on. The standardization allows a vendor to provide its customers with a piece of code (a “hook”) that tailors the Sarus container to the specific hardware, integrating it easily. This is also good for HPC centres because they no longer have to maintain that code.
PF: Open standards are not exactly a part of user-facing interfaces, so users will not necessarily see the difference as clearly as those who work within data centres. But in a way, this is still great for users, not only because Sarus resembles Docker and is therefore more familiar, but also because they get access to these capabilities more easily when providers like us can offer them more readily. If it were not for the standards, it would be much more difficult for us and others to adopt this software.
MM: In other words, it does not matter what type of GPU you are using, for instance, because the vendor will provide the hook for Sarus to work on their hardware. Above all, it is flexible.
How impactful is it that Sarus limits performance loss?
PF: One of the things that people are worried about when they are using containers is, ‘Am I going to lose performance?’ Some loss is sometimes unavoidable, because the application code must go through additional layers to work, but the situation has improved considerably: Sarus limits that loss to only a couple of percentage points. This ultimately increases the value of containerizing and also improves portability.
MM: It’s important to note that in some cases, containers can give exactly the same performance, or even better performance.
Who will benefit most from Sarus’s capabilities?
MM: Sarus makes it easier for users to reproduce experiments and create portable applications. You build your container once, and then you can use it on different HPC environments, with exactly the same code and exactly the same setup, and still get native performance.
PF: Sarus also helps with shipping containerized applications into the data centres and managing them there. But it is not just about portability; it is also about how easily users can deploy their applications.
MM: Yes, because it makes the containers independent of what is installed on the system, without losing performance. Before, users had to rebuild the container image for every new, different system they entered. This frequently led to lots of problems. Now, because users decide what is inside the container, they can select the software stack inside the container image, use the container, and ultimately still get great performance.
Who will benefit most from FirecREST’s capabilities?
PF: FirecREST is primarily targeting the developers of scientific workflows and portals, but I would say, actually, that everyone benefits from it. It enables developers to more easily create new applications and services that target HPC resources while also making their applications more user-friendly and accessible. These services are in turn supported by our HPC services. It is essentially democratizing access to this powerful resource by making it simple to use for so many people. Especially since the birth of “the Cloud”, there has been a lot of focus on making IT less of a burden for users, and FirecREST allows application developers to do just that for HPC.
What kind of support will CSCS offer to users of Sarus and FirecREST?
PF: Again, our role in the EuroCC project is to make this software more accessible to more centres like ours. Together with the software products, we offer product support, answer requests, and address bugs that arise from installing and configuring the products at an HPC centre. Once the products are widely used, one possible evolution of the support model is to create a community among HPC centres offering a service based on the products. Such a community can then drive the product development while prioritizing fixes and new features.
MM: This is why we have built the website: to enable HPC providers to easily understand the capabilities of the products. If you are interested in deploying and using the products, please do not hesitate to contact the team members for additional needs.