On May 15-17, CSCS organized a 2-day training course on performance engineering approaches at the compute-node level. "Performance engineering" here means developing a thorough understanding of the interactions between software and hardware. The instructors were Prof. Gerhard Wellein and Dr. Georg Hager from RRZE, Germany.
Introduction to Node-level Performance Engineering »
- Modern computer architectures
- Performance modeling & engineering approaches
- Accelerators
Node topology and programming models »
- Performance modeling
- Thread/Process core affinity
- The LIKWID tools
Micro-benchmarking for architectural exploration »
- The LIKWID tools: online demos
- Micro-benchmarking for architectural exploration
- Understanding the memory hierarchy
- Case study: OpenMP sparse matrix-vector multiplication
Performance modeling: The Roofline Model / Case study: A 3D Jacobi smoother (part 1) »
Performance modeling: The Roofline Model / Case study: A 3D Jacobi smoother (part 2) »
Understanding the memory hierarchy: Cache Mapping (part 1) »
- Optimal utilization of parallel resources
- Reading x86 assembly code and exploiting SIMD parallelism (part 1)
Understanding the memory hierarchy: Cache Mapping (part 2) »
- Reading x86 assembly code and exploiting SIMD parallelism (part 2)
- Performance analysis with hardware metrics
- Online demo: likwid-perfctr
Multicore Scaling: the ECM model »
- Performance modeling of stencil codes
- Examples and case studies