From Regex to Encoders: Low-Effort Negation Detection for Healthcare Data Integration in Italian Hospitals


Research field: Databases



Conference: International Conference on Artificial Intelligence in Medicine


Abstract

Negation detection is a critical component in extracting clinical insights from unstructured texts, especially within ETL (Extract, Transform, Load) pipelines for healthcare data integration. Traditional regex-based approaches, while effective, demand substantial effort, including extensive data annotation and iterative feedback from clinicians to craft and maintain domain-specific rules. In contrast, this paper introduces an encoder-based method for negation assertion detection in Italian clinical texts that achieves comparable performance while significantly reducing the need for clinical input. By leveraging pre-trained transformers and neural translation of public data for domain alignment and language localization, our approach offers a low-effort alternative for clinical negation detection that maintains high accuracy on real-world, unseen entities from multiple Italian hospitals. These findings suggest that encoder-based methods are mature enough for medical institutions to use them to reduce development overhead, offering a scalable alternative to regular expressions for the fundamental text-processing building blocks of clinical ETL workflows.
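To make the regex-based baseline mentioned in the abstract concrete, the following is a minimal sketch of rule-based negation detection for Italian text. It is not the authors' rule set: the cue list, the scope terminators, the `is_negated` function, and the 40-character window are all hypothetical choices made for illustration, and a production rule set would need the clinician feedback the abstract describes.

```python
import re

# Hypothetical negation cues for Italian (illustrative, not from the paper).
NEGATION_CUES = re.compile(
    r"\b(non|nega|negato|nessun[oa]?|assenza di|senza)\b", re.IGNORECASE
)
# Hypothetical scope terminators: punctuation or adversative conjunctions
# after which an earlier negation cue no longer applies.
TERMINATORS = re.compile(r"[.;:]|\b(ma|però|tuttavia)\b", re.IGNORECASE)


def is_negated(sentence: str, entity: str, window: int = 40) -> bool:
    """Return True if a negation cue appears in the `window` characters
    before the first occurrence of `entity`, with no terminator in between."""
    idx = sentence.lower().find(entity.lower())
    if idx < 0:
        raise ValueError(f"entity {entity!r} not found in sentence")
    left = sentence[max(0, idx - window):idx]
    # Keep only the text after the last terminator in the left context.
    last_end = 0
    for match in TERMINATORS.finditer(left):
        last_end = match.end()
    scope = left[last_end:]
    return NEGATION_CUES.search(scope) is not None


if __name__ == "__main__":
    for text, ent in [
        ("Il paziente nega febbre e tosse.", "febbre"),
        ("Paziente con febbre da tre giorni.", "febbre"),
        ("Non dolore toracico, ma presenta dispnea.", "dispnea"),
    ]:
        print(ent, is_negated(text, ent))
```

Even this small sketch shows where the maintenance cost comes from: every new cue, abbreviation, or scope rule has to be added and validated by hand, which is the effort the encoder-based approach aims to reduce.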


Authors

Tommaso Mario Buonocore
Department of Electrical, Computer and Biomedical Engineering, University of Pavia, 27100 Pavia, Italy
Andorra

Sonia Mognaschi
Biomeris SRL, 27100 Pavia, Italy
Italy

Lorenzo Beretta
Biomeris SRL, 27100 Pavia, Italy
Italy

Paper information

Publication year: 2025
Citations: 0
Publication countries: Italy, Andorra, Canada
Site: Springer
