Named Entity Recognition for Performance and Synthesis Information of Perovskite Solar Cells Using SpaCy


Research field: Databases



Conference: International Conference on Computational Science and Its Applications


Abstract

This paper presents a Named Entity Recognition (NER) approach for extracting key information about the structural components, synthesis techniques, and photovoltaic performance metrics of perovskite solar cells (PSCs). NER is a core component of Information Extraction (IE) that identifies and classifies key elements of a text. We propose using the SpaCy library to build an effective and accessible NER model suited to environments with limited computational capacity, in contrast to previous work in this field, which relies on large-scale models or substantial computational resources. The resulting model was evaluated using K-fold cross-validation, obtaining mean scores of 89.94% precision, 92.47% recall, and 89.69% F1. To test its practical performance, we applied the model and compared its output with the manual annotations in two Excel reference databases, Odabaşı (2019) [1] and Jacobsson (2022) [2]. The comparison demonstrates the potential of this work to facilitate and accelerate knowledge extraction, and suggests that the strategy could be extended to other scientific fields where automated text extraction is required.
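The evaluation described in the abstract can be illustrated with a minimal sketch, assuming entity-level exact-match scoring and contiguous folds; the abstract does not state the fold count, the matching criterion, or how the folds were built. The function names and the `train_fn` callback are hypothetical placeholders (a SpaCy training routine would take the place of `train_fn`), and nothing here is taken from the authors' code.

```python
from statistics import mean


def prf(gold, pred):
    """Precision, recall, and F1 for two sets of entity spans (exact match)."""
    tp = len(gold & pred)  # spans found in both gold and predicted sets
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def k_fold_indices(n_items, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_items) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def cross_validate(docs, k, train_fn):
    """Average per-fold precision, recall, and F1.

    docs: list of (text, spans), where spans is a set of (start, end, label).
    train_fn: hypothetical callback; takes training docs and returns a
              function predict(text) -> set of (start, end, label).
    """
    fold_scores = []
    for train_idx, test_idx in k_fold_indices(len(docs), k):
        predict = train_fn([docs[i] for i in train_idx])
        gold, pred = set(), set()
        for i in test_idx:
            text, spans = docs[i]
            # Prefix each span with the document index so that spans from
            # different documents never collide.
            gold |= {(i, *span) for span in spans}
            pred |= {(i, *span) for span in predict(text)}
        fold_scores.append(prf(gold, pred))
    # Mean of each metric across folds.
    return tuple(mean(column) for column in zip(*fold_scores))
```

Because each metric is averaged across folds independently, the mean F1 produced this way is generally not the harmonic mean of the mean precision and mean recall.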


Author Profile
Song Min Gyu

ANYANG University (AYU) Korea

Korea
Author Profile
Mary Zuleika Jiménez-Díaz

Escuela de Ingenierías Eléctrica, Electrónica y de Telecomunicaciones, Universidad Industrial de Santander, Bucaramanga, Santander, Colombia

Colombia
Author Profile
Alexander Sepúlveda-Sepúlveda

Escuela de Ingenierías Eléctrica, Electrónica y de Telecomunicaciones, Universidad Industrial de Santander, Bucaramanga, Santander, Colombia

Colombia

Paper information

Publication year: 2025
Citations: 0
Publication countries: Colombia, Korea
Site: Springer
