Research field: Strategies
Conference: 2025 International Russian Smart Industry Conference (SmartIndustryCon)
This study addresses the problem of identifying vulnerabilities in open-source code by exploring existing methods of code analysis and evaluating the feasibility of automating security assessments using large language models (LLMs). We propose an approach for constructing high-quality datasets to fine-tune LLMs for software security analysis. Our methodology involves collecting and processing vulnerability data, filtering and curating security-related code changes, and structuring datasets to optimize model fine-tuning. We present an algorithm for aggregating vulnerability data sources and constructing a dataset specifically for training security-focused LLMs. To validate our approach, we fine-tune models from the Qwen family for software vulnerability detection in Python codebases during development and testing. Our findings demonstrate that the proposed method enables the development of intelligent, continuously adaptable AI agents capable of identifying and analyzing emerging zero-day vulnerabilities, not only in Python but also in other structurally similar programming languages.
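The aggregate, filter, and structure steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the `VulnRecord` fields, the deduplication key, the filtering rules, and the prompt/completion layout are all assumptions made for illustration only.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class VulnRecord:
    """Hypothetical record shape; the paper's actual schema is not given."""
    vuln_id: str        # identifier from the originating data source
    file_path: str      # path of the file touched by the fix
    code_before: str    # code prior to the security fix
    code_after: str     # code after the security fix
    description: str    # human-readable vulnerability summary


def aggregate(*sources):
    """Merge several sources, keeping the first record per (vuln_id, file_path)."""
    merged = {}
    for source in sources:
        for rec in source:
            merged.setdefault((rec.vuln_id, rec.file_path), rec)
    return list(merged.values())


def is_usable(rec):
    """Assumed filter: Python files only, with a non-empty and real code change."""
    return bool(
        rec.file_path.endswith(".py")
        and rec.code_before.strip()
        and rec.code_before != rec.code_after
    )


def to_example(rec):
    """Assumed fine-tuning layout: one prompt/completion pair per record."""
    return {
        "prompt": "Review the following Python code for security "
                  "vulnerabilities:\n\n" + rec.code_before,
        "completion": f"Vulnerable ({rec.vuln_id}): {rec.description}",
    }


def build_dataset(*sources):
    """Aggregate, filter, and structure records into training examples."""
    return [to_example(r) for r in aggregate(*sources) if is_usable(r)]


def to_jsonl(examples):
    """Serialize examples as JSON Lines, a common fine-tuning input format."""
    return "\n".join(json.dumps(ex, ensure_ascii=False) for ex in examples)
```

Deduplicating before filtering keeps the first-seen source authoritative for each vulnerability; a real pipeline would likely need a more careful merge policy when sources disagree.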
| Publication year | 2025 |
|---|---|
| Citations | 70 |
| Country of publication | Russia |
| Site | IEEE |