Research field: Artificial Intelligence
Conference: MIDA '24: Proceedings of the 2024 International Conference on Machine Intelligence and Digital Applications
In the medical field, a growing volume of electronic medical record (EMR) data is being generated and stored, and it contains a wealth of valuable information. To address the difficulties of semantic understanding and data sparsity in current EMR processing, this paper implements data mining and processing based on Natural Language Processing (NLP) technology. First, Jieba was used for text segmentation, and Named Entity Recognition (NER) was extended using the BERT (Bidirectional Encoder Representations from Transformers) model. Then, the RoBERTa (Robustly Optimized BERT Approach) model was used to extract entity relationships, and text classification was implemented with a Convolutional Neural Network (CNN). Finally, key information was extracted from the classified text using regular expressions. The results show that the model reached an accuracy of 0.98 in extracting EMR content, a 6.5% improvement over Random Forest (RF). Recall reached 0.97 and the F1 score reached 0.97, with a standard deviation of only 0.01 across folds, indicating that the model is reliable and generalizes well. The method can accurately and efficiently extract key content such as patient medical history and diagnostic information.
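As an illustration of the final step only (regular-expression extraction from already classified text), the following is a minimal sketch. It is not the authors' implementation: the abstract does not give the patterns used, so the field names, the section labels (`既往史`, `诊断`, and their English equivalents), and the helper `extract_key_fields` are all assumptions made for this example.

```python
import re

# Hypothetical section labels; the paper does not list its actual patterns.
# Each pattern captures the rest of the line after a label and a colon
# (ASCII ":" or full-width ":").
FIELD_PATTERNS = {
    "medical_history": re.compile(r"(?:既往史|Past medical history)\s*[::]\s*([^\n]+)"),
    "diagnosis": re.compile(r"(?:诊断|Diagnosis)\s*[::]\s*([^\n]+)"),
}


def extract_key_fields(text):
    """Return the first match for each labelled field, or None if absent."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1).strip() if match else None
    return result


# Made-up record text, used only to exercise the function.
record = "Past medical history: hypertension for 5 years\nDiagnosis: type 2 diabetes"
fields = extract_key_fields(record)
print(fields)
```

In a full pipeline of the kind the abstract describes, this step would presumably run after segmentation, entity and relation extraction, and classification, so that each pattern is applied only to text of the matching category.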
| Publication year | 2024 |
|---|---|
| Citations | 0 |
| Country of publication | China |
| Site | ACM |