Leveraging Domain-Specific Databases for Seq2Seq-Based Relation Extraction from Materials Science Texts


Research field: Databases



Conference: International Conference on Human-Computer Interaction


Abstract

This paper presents a novel approach to extracting relations from materials science texts by leveraging domain-specific databases. We introduce a method that combines sequence-to-sequence (seq2seq) language models with knowledge graph embeddings derived from the Materials Project database. Our approach first constructs a comprehensive materials knowledge graph that incorporates properties such as element composition, magnetic ordering, and related materials. We then train knowledge graph embeddings using translation-based methods. The resulting embeddings are integrated into a seq2seq-based relation extraction model through special knowledge graph tokens. When evaluated on the Materials Science Procedural Text Corpus, our method achieves state-of-the-art performance with a micro-averaged F1-score of 61.87%, a 2.63-point improvement over the baseline Flan-T5-large model. This work demonstrates the effectiveness of incorporating domain-specific database information for enhancing relation extraction from materials science literature.
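To make the "translation-based" embedding step concrete, the following is a minimal sketch of how such a method scores a knowledge graph triple. It assumes TransE as a representative translation-based model; the abstract does not say which variant the authors used. The entity names, the `has_element` relation, the 3-dimensional vectors, and the margin value are all hypothetical and chosen only for illustration.

```python
import math

def transe_score(head, relation, tail):
    """L2 distance ||h + r - t||; a lower score means a more plausible triple."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def margin_loss(pos_score, neg_score, margin=1.0):
    """Margin ranking loss: zero once the corrupted triple scores
    at least `margin` worse (farther) than the true triple."""
    return max(0.0, margin + pos_score - neg_score)

# Hypothetical embeddings (not taken from the paper or from Materials Project).
fe2o3 = [0.2, 0.1, 0.0]          # head entity: a material
has_element = [0.3, -0.1, 0.5]   # relation: element composition
fe = [0.5, 0.0, 0.5]             # true tail entity
cu = [-0.4, 0.9, 0.1]            # corrupted tail entity (negative sample)

pos = transe_score(fe2o3, has_element, fe)   # true triple
neg = transe_score(fe2o3, has_element, cu)   # corrupted triple
loss = margin_loss(pos, neg)

print(f"positive score: {pos:.3f}, negative score: {neg:.3f}, loss: {loss:.3f}")
```

In a training loop, the loss would be minimized over many true and corrupted triples so that true triples end up closer than corrupted ones. How the trained vectors are then attached to the special knowledge graph tokens of the seq2seq model is not described in the abstract, so it is left out of this sketch.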


Author Profile
Masaki Asada

National Institute of Advanced Industrial Science and Technology (AIST) Tokyo Japan

Andorra
Author Profile
Ken Fukuda

National Institute of Advanced Industrial Science and Technology (AIST) Tokyo Japan

Andorra

📄 Paper information

Publication year: 2025
Citations: 0
Country of publication: Andorra
Site: Springer