연구 분야: Verification
학회: SAC '23: Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing
In the days of AI, data-centric machine learning (ML) models are increasingly used in various complex systems. While many researchers are focusing on specifying ML-specific performance requirements, not enough guideline is provided to engineer the data requirements systematically involving diverse stakeholders. Lack of written agreement about the training data, collaboration bottlenecks, lack of data validation framework, etc. are posing new challenges to ensuring training data fitness for safety-critical ML components. To reduce these gaps, we propose a multi-layered framework that helps to perceive and elicit data requirements. We provide a template for verifiable data requirements specifications. Moreover, we show how such requirements can facilitate an evidence-driven assessment of the training data quality based on the experts' judgments about the satisfaction of the requirements. We use Dempster Shafer's theory to combine experts' subjective opinions in the process. A preliminary case study on the CityPersons dataset for the pedestrian detection feature of autonomous cars shows the usefulness of the proposed framework for data requirements understanding and the confidence assessment of the dataset.
| 발행 연도 | 2023년 |
|---|---|
| 인용수 | 4 |
| 출판 국가 | Andorra, Korea |
| 사이트 | ACM |
| 좋아요 수 | 0 |