Using Chao’s Estimator as a Stopping Criterion for Technology-Assisted Review


Research area: Verification



Venue: ACM Transactions on Information Systems, Volume 43, Issue 3


Abstract

Technology-Assisted Review aims to reduce the human effort required for screening processes such as abstract screening for Systematic Literature Reviews. During this process, human reviewers label documents as relevant or irrelevant, while the system incrementally updates a prediction model based on the reviewers’ previous decisions. After each model update, the system proposes new documents it deems relevant, so that relevant documents are prioritized over irrelevant ones. A stopping criterion is needed to tell users when to end the review, so that both the number of missed relevant documents and the number of irrelevant documents read are kept small. In this article, we propose and evaluate a new ensemble-based Active Learning strategy and a stopping criterion based on Chao’s Population Size Estimator, which estimates the prevalence of relevant documents in the dataset. Our simulation study compares this criterion with other methods presented in the literature and shows that it performs well on several datasets.
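As a rough illustration of the idea, and not the authors' implementation, the sketch below applies the standard bias-corrected form of Chao's lower-bound estimator, N = n + f1(f1 - 1) / (2(f2 + 1)), to the relevant documents found by each ensemble member. Here n is the number of distinct relevant documents found, f1 the number found by exactly one member, and f2 the number found by exactly two. The function names `chao_estimate` and `should_stop`, the `target_recall` parameter, and the stopping rule itself are assumptions made for this example; the article's actual criterion may differ.

```python
from collections import Counter
from typing import List, Set


def chao_estimate(found_by_member: List[Set[str]]) -> float:
    """Bias-corrected Chao estimate of the total number of relevant documents.

    found_by_member holds, for each ensemble member, the IDs of the relevant
    documents that member surfaced (hypothetical input format).
    """
    # Count how many ensemble members found each relevant document.
    capture_counts = Counter()
    for found in found_by_member:
        capture_counts.update(found)

    n_observed = len(capture_counts)
    frequency = Counter(capture_counts.values())
    f1, f2 = frequency[1], frequency[2]  # found by exactly one / two members

    # The bias-corrected form stays defined when f2 == 0.
    return n_observed + f1 * (f1 - 1) / (2 * (f2 + 1))


def should_stop(found_by_member: List[Set[str]], target_recall: float = 0.95) -> bool:
    """Assumed rule: stop once the found documents reach the target share of the estimate."""
    n_found = len(set().union(*found_by_member))
    if n_found == 0:
        return False  # nothing found yet, so the estimate is uninformative
    return n_found >= target_recall * chao_estimate(found_by_member)
```

For example, with three members that found `{"a", "b", "c"}`, `{"a", "b", "d"}`, and `{"a", "e"}`, five distinct documents are observed, three of them by a single member and one by two members, giving an estimate of 5 + 3·2 / (2·2) = 6.5. At a 95% target the review would continue.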


Authors

Michiel Bron
Department of Information and Computing Sciences, Faculty of Science, Utrecht University, Utrecht, The Netherlands; and The Netherlands National Police, Den Haag (The Hague), The Netherlands

Peter G van der Heijden
Department of Methods and Statistics, Faculty of Social Sciences, Utrecht University, Utrecht, The Netherlands; and University of Southampton, Southampton, United Kingdom of Great Britain and Northern Ireland

Ad J Feelders
Department of Information and Computing Sciences, Faculty of Science, Utrecht University, Utrecht, The Netherlands

Paper information

Publication year: 2025
Citations: 1
Publication country: Andorra
Site: ACM