Unlocking the Potential of Containers in Scientific Computing to Achieve Bitwise Reproducibility, Portability and Performance


Research field: Software Development



Conference: Nordic e-Infrastructure Collaboration Conference


Abstract

The modern form of containers, popularized by platforms such as Docker, Singularity (Apptainer), Podman and Charliecloud, to cite only a few, began to take shape over a decade ago. However, the fundamentals of containerization and the actual benefits of containers in scientific computing remain largely unclear to a vast majority of users. In fact, there is a significant gap between the simplistic “Hello world” examples found online and real scientific applications. There is also a persistent rumor that satisfactory performance cannot be achieved on supercomputers across multiple nodes. The aim of this paper is to explain how containers can leverage the potential of high-speed networks for inter-node communication with UCX on Fram and Betzy (from the Norwegian national e-infrastructure provider). It also shows how to achieve near-native performance on LUMI (EuroHPC’s flagship) despite its “Slingshot-11” interconnect and proprietary library. Results obtained with the standard OSU Micro-Benchmarks tests for latency and bandwidth, and with a fully-fledged climate model, demonstrate that containerized applications perform just as well as their bare-metal counterparts, are portable, and provide bit-for-bit reproducibility across different platforms. Containers are therefore highly recommended to minimize deployment and porting issues i) for AaaS (Applications as a Service) that come with the complete software environment they need (rather than source code only); and ii) so that HPC users do not have to rely on anybody else to install what they need and can be operational within minutes while still getting top performance.


Authors

Jean Iaquinta
Norwegian Research Infrastructure Services, Oslo, Norway

Anne Fouilloux
Simula Research Laboratory, Oslo, Norway

Paper information

Publication year: 2025
Citations: 0
Country of publication: Norway
Site: Springer
