Research field: Software Development
Conference: 2024 IEEE High Performance Extreme Computing Conference (HPEC)
We present Syndeo: a software framework for container orchestration of Ray on Slurm. In general, the idea behind Syndeo is to write code once and deploy anywhere. Specifically, Syndeo is designed to address the issues of portability, scalability, and security for parallel computing. The design is portable because the containerized Ray code can be re-deployed on Amazon Web Services, Microsoft Azure, Google Cloud, or Alibaba Cloud. The process is scalable because we optimize for multi-node, high-throughput computing. The process is secure because users are forced to operate with unprivileged profiles, meaning administrators control the access permissions. We demonstrate Syndeo's portable, scalable, and secure design by deploying containerized parallel workflows on Slurm, which Ray does not officially support.

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited. This material is based upon work supported by the Department of the Air Force under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Department of the Air Force.
| Publication year | 2024 |
|---|---|
| Citations | 19 |
| Country of publication | Morocco |
| Site | IEEE |
| Likes | 0 |