Data Enhancement for Binary Classification of Relational Data


Research area: Infrastructure



Venue: Proceedings of the ACM on Management of Data, Volume 3, Issue 3


Abstract

This paper studies enhancement of training data D to improve the robustness of machine learning (ML) classifiers M against adversarial attacks on relational data. Data enhancing aims to (a) defuse poisoned imperceptible features embedded in D, and (b) defend against attacks at prediction time that are unseen in D. We show that while there exists an inherent tradeoff between the accuracy and robustness of M in case (b), data enhancing can improve both the accuracy and robustness at the same time in case (a). We formulate two data enhancing problems accordingly, and show that both problems are intractable. Despite the hardness, we propose a framework that integrates model training and data enhancing. Moreover, we develop algorithms for (a) detecting and debugging corrupted imperceptible features in training data, and (b) selecting and adding adversarial examples to training data to defend against unseen attacks at prediction time. Using real-life datasets, we empirically verify that the method is at least 20.4% more robust and 2.02× faster than state-of-the-art methods for classifiers M, without degrading the accuracy of M.


Author Profile
Wenfei Fan

Shenzhen Institute of Computing Sciences, Shenzhen, China; University of Edinburgh, Edinburgh, United Kingdom; and Beihang University, Beijing, China

Andorra
Author Profile
Xiaoyu Han

Fudan University, Shanghai, China

China
Author Profile
Weilong Ren

Shenzhen Institute of Computing Sciences, Shenzhen, China

China

Paper information

Publication year: 2025
Citations: 0
Publication country: Andorra, China
Site: ACM
