Benchmarking Large Language Models for Log Analysis, Security, and Interpretation


Research field: Analysis



Venue: Journal of Network and Systems Management


Abstract

Large Language Models (LLMs) continue to demonstrate their utility through a variety of emergent capabilities in different fields. One area of cybersecurity that could benefit from effective language understanding is the analysis of log files. This work benchmarks LLMs with different architectures (BERT, RoBERTa, DistilRoBERTa, GPT-2, and GPT-Neo) on their capacity to analyze application and system log files for security. Specifically, 60 fine-tuned language models for log analysis are deployed and benchmarked. The results show that these models can perform log analysis effectively, with fine-tuning being particularly important for domain adaptation to specific log types. The best-performing fine-tuned sequence classification model (DistilRoBERTa) outperforms the current state of the art, with an average F1-Score of 0.998 across six datasets from both web application and system log sources. To achieve this, we propose and implement a new experimentation pipeline (LLM4Sec), which leverages LLMs for log analysis experimentation, evaluation, and analysis.
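To make the headline metric concrete, the following is a minimal sketch of how an F1-Score averaged across several datasets could be computed. It is not the LLM4Sec pipeline: the function names, the dataset names, and the toy labels are all hypothetical, and it assumes a binary task (1 = anomalous log entry, 0 = normal) with an unweighted mean over datasets, which the abstract does not specify.

```python
from typing import Dict, List, Tuple


def f1_score(y_true: List[int], y_pred: List[int], positive: int = 1) -> float:
    """Binary F1 for the `positive` class (assumed here to mean 'anomalous')."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(results: Dict[str, Tuple[List[int], List[int]]]) -> float:
    """Unweighted mean of the per-dataset F1 scores."""
    scores = [f1_score(y_true, y_pred) for y_true, y_pred in results.values()]
    return sum(scores) / len(scores)


# Toy labels for illustration only; these are NOT results from the paper.
toy_results = {
    "toy_web_logs": ([1, 0, 1, 1, 0], [1, 0, 1, 0, 0]),
    "toy_system_logs": ([0, 0, 1, 1], [0, 0, 1, 1]),
}

if __name__ == "__main__":
    for name, (y_true, y_pred) in toy_results.items():
        print(name, round(f1_score(y_true, y_pred), 3))
    print("average", round(average_f1(toy_results), 3))
```

An unweighted mean treats every dataset equally regardless of size; a benchmark could instead weight by the number of log entries, so the averaging scheme should be checked against the paper before comparing numbers.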


Author Profile
Egil Karlsen

Faculty of Computer Science, Dalhousie University, University Ave., Halifax, NS B3H 1W5, Canada

Canada
Xiao Luo

Department of Management Science and Information Systems, Oklahoma State University, 370 Business Building, Stillwater, OK 74078, USA

USA
Nur Zincir-Heywood

Faculty of Computer Science, Dalhousie University, University Ave., Halifax, NS B3H 1W5, Canada

Canada

Paper Information

Publication year: 2024
Citations: 0
Publication countries: USA, Canada
Site: Springer