Efficient and robust active learning methods for interactive database exploration


Research area: Databases



Venue: The VLDB Journal


Abstract

There is an increasing gap between the fast growth of data and the limited human ability to comprehend data. Consequently, there has been a growing demand for data management tools that can bridge this gap and help the user retrieve high-value content from data more effectively. In this work, we propose an interactive data exploration system as a new database service, using an approach called “explore-by-example.” Our new system is designed to assist the user in performing highly effective data exploration while reducing the human effort in the process. We cast the explore-by-example problem in a principled “active learning” framework. However, traditional active learning suffers from two fundamental limitations: slow convergence and lack of robustness under label noise. To overcome the slow convergence and label noise problems, we bring the properties of important classes of database queries to bear on the design of new algorithms and optimizations for active learning-based database exploration. Evaluation results using real-world datasets and user interest patterns show that our new system, both in the noise-free case and in the label noise case, significantly outperforms state-of-the-art active learning techniques and data exploration systems in accuracy while achieving the desired efficiency for interactive data exploration.


Authors
Enhui Huang, École Polytechnique, Palaiseau, France
Yanlei Diao, École Polytechnique, Palaiseau, France
Anna Liu, University of Massachusetts Amherst, Amherst, USA

Paper information

Publication year: 2023
Citations: 0
Publication countries: United States, France
Site: Springer
