Scientific Computing for RSE
RSEs with expertise in HPC and other performance-critical computing domains specialize in optimizing code for efficient execution across various platforms, including clusters, cloud, edge, and embedded systems. They understand parallel programming models, hardware-specific optimizations, profiling tools, and platform constraints such as memory, energy, and latency. Their skills enable them to adapt software to diverse infrastructures, manage complex dependencies, and support researchers in accessing and using advanced computing resources effectively and sustainably.
Module Overview
This module provides an entry‑level yet rigorous foundation in scientific computing for graduate students and researchers who need to design, implement, and evaluate computational experiments. Learners gain an awareness of the numerical underpinnings of modern simulation and data‑driven research, with an emphasis on writing reproducible, efficient, and trustworthy code.
Intended Learning Outcomes
By the end of the module, participants will be able to:
- Benchmark small programs and interpret performance metrics in a research context.
- Explain how approximation theory and floating‑point arithmetic affect numerical accuracy and stability.
- Identify when to use established simulation libraries (e.g., BLAS/LAPACK, PETSc, Trilinos) instead of custom code.
- Write simple GPU kernels and describe the core principles of accelerator programming.
- Submit and monitor batch & array jobs on a mid‑size compute cluster.
- Describe common HPC challenges—such as I/O bottlenecks, threading, and NUMA—and propose mitigation strategies.
- Maintain research software through continuous benchmarking.
The module qualifies participants for the more advanced module Scientific (High-Performance) Computing.
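The benchmarking outcome above can be made concrete with a minimal sketch in plain Python; the `benchmark` helper and the `triad` workload are illustrative examples, not part of the module materials. The sketch shows two habits stressed in week 1: warm-up runs before timing, and reporting robust statistics over repeats rather than a single measurement.

```python
import statistics
import time

def benchmark(func, *args, repeats=5, warmup=1):
    """Time a callable over several runs and report robust statistics.

    A warm-up pass and repeated runs reduce the influence of cold caches,
    lazy initialization, and OS jitter on the measurement.
    """
    for _ in range(warmup):
        func(*args)                      # warm-up: not timed
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()      # monotonic, high-resolution clock
        func(*args)
        samples.append(time.perf_counter() - start)
    return {"min": min(samples), "median": statistics.median(samples)}

def triad(n=100_000):
    """A small stream-triad-style workload: a[i] = b[i] + 3.0 * c[i]."""
    a = [0.0] * n
    b = [1.0] * n
    c = [2.0] * n
    for i in range(n):
        a[i] = b[i] + 3.0 * c[i]
    return a

result = benchmark(triad)
print(f"min {result['min']:.4f} s, median {result['median']:.4f} s")
```

Reporting the minimum alongside the median is a common convention for micro-benchmarks: the minimum approximates the noise-free cost, while the median indicates typical behaviour.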
Syllabus (Indicative Content)
| Week | Theme | Topics |
|---|---|---|
| 1 | Benchmarking & Profiling | Timing strategies · micro vs. macro benchmarks · tooling overview |
| 2 | Precision & Approximation | IEEE‑754 recap · conditioning & stability · error propagation |
| 3 | Scientific Libraries | BLAS/LAPACK anatomy · hierarchical I/O libraries · overview of PETSc/Trilinos/Hypre |
| 4 | GPU Primer | Kernel model · memory hierarchy · lightning intro to CUDA/OpenCL/PyTorch |
| 5 | Working on a Cluster | Slurm basics · job arrays · job dependencies · simple Bash launchers |
| 6 | HPC Pitfalls | I/O throughput · thread oversubscription · NUMA awareness |
| 7 | Software Maintenance | Regression + performance tests · continuous benchmarking pipelines |
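The week-2 themes of rounding error and error propagation can be illustrated with a short sketch, assuming IEEE-754 double precision as used by Python floats: naive left-to-right summation accumulates rounding error, while compensated (Kahan) summation recovers most of the lost accuracy.

```python
def naive_sum(xs):
    """Left-to-right accumulation; each += rounds, so error can grow with n."""
    s = 0.0
    for x in xs:
        s += x
    return s

def kahan_sum(xs):
    """Compensated summation: carry each step's rounding error forward."""
    s, c = 0.0, 0.0
    for x in xs:
        y = x - c          # apply the correction from the previous step
        t = s + y
        c = (t - s) - y    # recover the low-order bits lost in s + y
        s = t
    return s

xs = [0.1] * 10            # 0.1 is not exactly representable in binary
print(naive_sum(xs) == 1.0)          # False: rounding error has accumulated
print(abs(naive_sum(xs) - 1.0))      # error of the naive reduction
print(abs(kahan_sum(xs) - 1.0))      # compensated error is no larger
```

The same effect is why parallel reductions can give run-to-run differences: changing the summation order changes which rounding errors occur, even though every individual operation is correctly rounded.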
Teaching & Learning Methods
Short lectures (30%) are coupled with hands‑on labs (70%). Students complete weekly notebooks and a mini‑project that reproduces and optimizes a published computational result.
Assessment
| Component | Weight | Details |
|---|---|---|
| Continuous labs | 40% | Weekly graded notebooks |
| Final mini‑project | 60% | Report, code, and benchmark suite |
Prerequisites
- Basic programming in Python, C/C++, or Julia
- Undergraduate calculus & linear algebra