Research field: Software Development
Conference: European Conference on Advances in Databases and Information Systems
Experiment management systems (EMSs), such as MLflow, are increasingly used to streamline the collection and management of machine learning (ML) artifacts in iterative and exploratory ML experiment workflows. However, EMSs typically offer only limited provenance capabilities, which makes it hard to analyze the provenance of ML artifacts and to gain knowledge for improving experiment pipelines. In this paper, we propose a comprehensive provenance model, compliant with the W3C PROV standard, that captures the provenance of ML experiment pipelines and their artifacts related to Git and MLflow activities. Moreover, we present the tool MLFLOW2PROV, which extracts provenance graphs according to our model from existing projects, enabling the collected pipeline provenance information to be queried, analyzed, and further processed.
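
To make the idea of a PROV-style provenance graph over Git and MLflow activities more concrete, the following is a minimal sketch in plain Python. It is not the paper's model and not the MLFLOW2PROV API: the `ProvGraph` class, its method names, and all identifiers such as `git:commit/abc123` or `mlflow:run/1` are hypothetical and chosen only for illustration. Only the relation names `wasGeneratedBy`, `used`, and `wasAssociatedWith` come from the W3C PROV data model.

```python
from dataclasses import dataclass, field


@dataclass
class ProvGraph:
    """Tiny in-memory stand-in for a PROV document (illustrative only)."""
    entities: set = field(default_factory=set)
    activities: set = field(default_factory=set)
    agents: set = field(default_factory=set)
    # Each relation is stored as (relation_name, subject, object).
    relations: list = field(default_factory=list)

    def was_generated_by(self, entity, activity):
        self.entities.add(entity)
        self.activities.add(activity)
        self.relations.append(("wasGeneratedBy", entity, activity))

    def used(self, activity, entity):
        self.activities.add(activity)
        self.entities.add(entity)
        self.relations.append(("used", activity, entity))

    def was_associated_with(self, activity, agent):
        self.activities.add(activity)
        self.agents.add(agent)
        self.relations.append(("wasAssociatedWith", activity, agent))

    def lineage(self, entity):
        """Return every entity reachable by following
        wasGeneratedBy -> used chains backwards from `entity`."""
        seen = set()
        stack = [entity]
        while stack:
            current = stack.pop()
            for rel, subj, obj in self.relations:
                if rel == "wasGeneratedBy" and subj == current:
                    # `obj` is the activity that generated `current`.
                    for rel2, act, used_entity in self.relations:
                        if rel2 == "used" and act == obj and used_entity not in seen:
                            seen.add(used_entity)
                            stack.append(used_entity)
        return seen


graph = ProvGraph()

# Hypothetical Git activity: a commit produces a version of a training script.
graph.was_generated_by("file:train.py@abc123", "git:commit/abc123")
graph.was_associated_with("git:commit/abc123", "user:alice")

# Hypothetical MLflow activity: a run uses the script and a dataset,
# and produces a model artifact.
graph.used("mlflow:run/1", "file:train.py@abc123")
graph.used("mlflow:run/1", "data:train.csv")
graph.was_generated_by("model:model.pkl", "mlflow:run/1")
graph.was_associated_with("mlflow:run/1", "user:alice")

# Query: which inputs does the model artifact trace back to?
print(sorted(graph.lineage("model:model.pkl")))
```

A query such as `lineage` is one example of what becomes possible once Git and MLflow records are linked in a single graph; a real PROV document would also carry attributes, timestamps, and further relation types that this sketch leaves out.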
| Publication year | 2023 |
|---|---|
| Citations | 0 |
| Country of publication | Germany |
| Site | Springer |