A Method for Extracting Information from Long Documents that Combines Large Language Models with Natural Language Understanding Techniques


Research field: Artificial Intelligence



Conference: 2023 4th International Conference on Computer, Big Data and Artificial Intelligence (ICCBD+AI)


Abstract

Information extraction is an important task in natural language processing and is widely used across industries, but because document types change constantly, extraction quality remains a challenge in practical applications. Previous research commonly used traditional AI models for information extraction from long documents. However, because of limits on input length and on the ability to understand the semantics of extended text, the results have consistently fallen short of expectations and have not been deployed in industrial production. In this paper, we propose a solution for extracting information from long documents based on a large language model. The approach retains the long-text semantic understanding of large language models while using discriminative methods to alleviate their hallucinations and the problem of uncontrollable output. Compared with other general methods, our approach achieves a significant performance improvement on the long-text dataset we collected. Additionally, our approach is reproducible and can be quickly customized and adapted to other scenarios.


Authors

Linjie Chen
China Mobile Information Technology Company Limited, Shenzhen, China

Min Sun
China Mobile Information Technology Company Limited, Shenzhen, China

Jiarong Liu
China Mobile Information Technology Company Limited, Shenzhen, China

Paper information

Publication year: 2023
Citations: 1
Country of publication: China
Site: IEEE

Related papers (27)