Evidence Weighted Tree Ensembles for Text Classification


Research area: Infrastructure



Venue: SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval


Abstract

Text documents are often mapped to vectors of binary values, where 1 indicates the presence of a word and 0 indicates its absence. These vectors are then used to train predictive models. In tree-based ensemble models, the predictions of some decision trees may be based purely on absent words. Such predictions should be trusted less, because the absence of a word can be interpreted in multiple ways. In this work, we propose to improve the comprehensibility and accuracy of ensemble models by distinguishing between word presence and absence. The presented method weights predictions based on word presence. Experimental results on 35 real text datasets indicate that our method outperforms state-of-the-art ensemble methods on various text classification tasks.
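The idea in the abstract can be illustrated with a small sketch. It is not the authors' method: the abstract does not give the weighting formula, so the rule used here (a fixed, hypothetical `absent_weight` applied to any tree whose decision path tested only absent words) is an assumption made purely for illustration, as are the toy vocabulary, the trees, and all function names.

```python
from typing import Dict, List, Tuple, Union

# A leaf is a class label (str); an internal node is
# (word, subtree_if_absent, subtree_if_present). Toy representation, assumed here.
Tree = Union[str, Tuple[str, "Tree", "Tree"]]


def binarize(text: str, vocab: List[str]) -> Dict[str, int]:
    """Map a document to binary features: 1 if the word occurs, 0 otherwise."""
    tokens = set(text.lower().split())
    return {word: int(word in tokens) for word in vocab}


def predict_with_evidence(tree: Tree, x: Dict[str, int]) -> Tuple[str, int]:
    """Follow the decision path; return the label and the number of
    tests on the path that were satisfied by a present word."""
    present_hits = 0
    node = tree
    while not isinstance(node, str):
        word, absent_branch, present_branch = node
        if x[word] == 1:
            present_hits += 1
            node = present_branch
        else:
            node = absent_branch
    return node, present_hits


def weighted_vote(trees: List[Tree], x: Dict[str, int],
                  absent_weight: float = 0.25) -> Tuple[str, Dict[str, float]]:
    """Sum votes per label, down-weighting trees that relied only on absent words.
    The fixed down-weight is a hypothetical choice, not the paper's scheme."""
    scores: Dict[str, float] = {}
    for tree in trees:
        label, hits = predict_with_evidence(tree, x)
        weight = 1.0 if hits > 0 else absent_weight
        scores[label] = scores.get(label, 0.0) + weight
    return max(scores, key=scores.get), scores


vocab = ["election", "minister", "vote"]
trees: List[Tree] = [
    ("election", "sport", "politics"),  # predicts "sport" when "election" is absent
    ("minister", "sport", "politics"),  # predicts "sport" when "minister" is absent
    ("vote", "sport", "politics"),      # predicts "politics" when "vote" is present
]

x = binarize("the vote was close", vocab)
label, scores = weighted_vote(trees, x)
plain_label, _ = weighted_vote(trees, x, absent_weight=1.0)  # ordinary majority vote
```

In this toy case two trees vote "sport" only because certain words are missing, while one tree votes "politics" because "vote" actually occurs. An unweighted majority vote follows the two absence-based trees, whereas the down-weighted vote follows the presence-based one.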


Authors
Md Zahidul Islam, University of South Australia, Adelaide, Australia
Jixue Liu, University of South Australia, Adelaide, Australia
Jiuyong Li, University of South Australia, Adelaide, Australia

Paper information

Publication year: 2020
Citations: 1
Country of publication: Australia
Site: ACM