A Research Infrastructure for science with extreme-scale data and computing needs
Alps is a general-purpose compute and data Research Infrastructure (RI) open to the broad community of researchers in Switzerland and the rest of the world. Alps will provide a high-impact, challenging, and innovative RI that will allow Switzerland to advance science and impact society.
Alps enables the creation of versatile clusters (vClusters) that can be tailored to the specific needs of users while maintaining confidentiality. For example, one vCluster is dedicated to MeteoSwiss’ numerical weather forecasts, another to the User Lab, and another to Machine Learning and Artificial Intelligence.
Alps is geo-distributed across several sites. This makes it possible, for example, to provide geographically redundant supercomputing services or to access large amounts of data stored in different locations.
Alps is currently housed at the following locations:
- CSCS in Lugano
- EPFL in Lausanne
- Paul Scherrer Institute (PSI) in Villigen, for data archiving
- ECMWF in Bologna, for access to meteorological data
System Specification
Overview
Model | HPE Cray EX |
Interconnect | HPE Cray Slingshot-11 with 200 Gbps injection bandwidth per module / GPU |
Scratch disk | 100 + 10 PB on hard disk; 5 + 1 PB on Solid State Disk (SSD) |
Data archive and backup | 2 x 130 PB tape libraries |
Nodes Overview
# of nodes | # of sockets per node | Total # of sockets | Processor(s) | Specifications | TFlops |
2,688 | 4 | 10,752 | NVIDIA Grace-Hopper | 72 ARM cores, 128 GB LPDDR5X RAM, H100 GPU with 96 GB HBM3 memory | n/a |
1,024 | 2 | 2,048 | AMD EPYC 7742 CPU (Rome) | 2x64 cores, 256/512 GB DDR RAM | 4,719 |
144 | 1 CPU + 4 GPU | 720 | AMD EPYC host CPU + NVIDIA A100 | CPU (64 cores, 128 GB DDR RAM) and 4 NVIDIA A100-96/80 GPUs (96/80 GB HBM2E) | n/a |
128 | 4 | 512 | AMD MI300A CPU+GPU | n/a | n/a |
24 | 1 | 24 | AMD EPYC host CPU + AMD MI250X GPU | CPU (64 cores, 128 GB RAM) and 4 AMD MI250X GPUs | n/a |
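The socket totals and the single TFlops figure in the table can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only. The `Partition` class and `peak_tflops` helper are hypothetical names, and the 2.25 GHz base clock and 16 double-precision FLOPs per cycle per core used for the EPYC 7742 are assumptions that do not appear in the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """One row of the Nodes Overview table (hypothetical helper type)."""
    name: str
    nodes: int
    sockets_per_node: int

    @property
    def total_sockets(self) -> int:
        return self.nodes * self.sockets_per_node


# Node and socket counts copied from the table.
PARTITIONS = [
    Partition("NVIDIA Grace-Hopper", 2688, 4),
    Partition("AMD EPYC 7742 (Rome)", 1024, 2),
    Partition("AMD EPYC + NVIDIA A100", 144, 5),  # 1 CPU + 4 GPU counted as 5
    Partition("AMD MI300A", 128, 4),
    Partition("AMD EPYC + AMD MI250X", 24, 1),
]


def peak_tflops(sockets: int, cores_per_socket: int,
                clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak: sockets x cores x GHz x FLOPs/cycle gives GFlops."""
    gflops = sockets * cores_per_socket * clock_ghz * flops_per_cycle
    return gflops / 1000.0


if __name__ == "__main__":
    for p in PARTITIONS:
        print(f"{p.name}: {p.total_sockets:,} sockets")

    rome = PARTITIONS[1]
    # Assumed values (not in the table): 2.25 GHz base clock, 16 FLOPs/cycle.
    tflops = peak_tflops(rome.total_sockets, 64, 2.25, 16)
    print(f"EPYC 7742 partition peak: {tflops:,.0f} TFlops")
```

Under these assumptions the EPYC 7742 partition comes out at about 4,719 TFlops, which matches the table entry and suggests that figure is a theoretical peak at base clock rather than a measured result.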
Installation & Upgrade History
January - June 2024
Stepwise installation of the 2,688 nodes with NVIDIA Grace-Hopper processors for a total of 10,752 sockets.
April 2024
To enhance the capability and availability of its Research Infrastructure, CSCS is collaborating with EPFL to extend its extreme-scale computing and data infrastructure, Alps, to the EPFL campus. This extension of Alps will be available as a failover for the Federal Office of Meteorology and Climatology MeteoSwiss service in spring 2024.
October 2023
Extension of the scratch disk with 5 + 1 PB of Solid State Disk (SSD).
July 2023
Extension of the scratch disk to 100 PB, based on a ClusterStor storage system with 8,480 x 16 TB hard disks. The usable capacity is 101,082 TB with 1 TB/s throughput. The connection to the compute nodes uses the HPE Slingshot interconnect.
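As a rough sanity check on the scratch-extension figures, the raw disk capacity can be compared with the stated usable capacity. This is a minimal sketch, assuming the usable capacity is 101,082 TB (about 101 PB, consistent with the 100 PB headline) and that the 1 TB/s throughput could be sustained end to end. The variable names are illustrative.

```python
# Figures taken from the July 2023 entry.
DISKS = 8480
DISK_TB = 16
USABLE_TB = 101_082        # assumed reading of the stated usable capacity
THROUGHPUT_TB_PER_S = 1.0  # assumed to be sustained for the whole transfer

raw_tb = DISKS * DISK_TB                 # raw capacity before redundancy
usable_fraction = USABLE_TB / raw_tb     # share left after redundancy/overhead
fill_hours = USABLE_TB / THROUGHPUT_TB_PER_S / 3600  # time to write it once

print(f"Raw capacity:    {raw_tb:,} TB")
print(f"Usable fraction: {usable_fraction:.1%}")
print(f"Time to fill:    {fill_hours:.1f} hours")
```

The gap between raw and usable capacity is what one would expect from redundancy and file system overhead, although the source does not state which protection scheme is used.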
Activation of the HPE Data Management Framework, which enables data movement between different storage tiers.
March 2023
Addition of a small staging system for the Cray Management System (CMS), composed of Master nodes, Storage nodes, 5 Worker nodes, and 2 Compute nodes.
April 2021
CSCS, Hewlett Packard Enterprise and NVIDIA Announce World’s Most Powerful AI-Capable Supercomputer.
October 2020
Installation of the first cabinets and compute modules of the Alps Research Infrastructure. This phase comprises 1,024 compute nodes based on the HPE Cray supercomputing architecture with the Slingshot interconnect and an HPE ClusterStor storage system. Each node is equipped with two AMD EPYC(TM) 7742 64-core processors. The scratch storage capacity is 10 PB.