Research on path planning algorithm of partially observable penetration test based on reinforcement learning


Research area: Strategies



Conference: ICMLCA '23: Proceedings of the 2023 4th International Conference on Machine Learning and Computer Application


Abstract

This paper proposes DEU-DRQNper, a new deep reinforcement learning algorithm for penetration test path planning under partially observable conditions. The algorithm extends DRQN with three improvements: prioritized sequence experience replay, an action exploration and selection strategy that combines the ε-greedy and UCB algorithms, and the Double DQN method of calculating the target Q value. During sampling, prioritized sequence experience replay selects more valuable experience sequences according to a probability distribution derived from their priorities, which accelerates learning. The hybrid ε-greedy and UCB strategy balances exploration and exploitation, improving efficiency. Calculating the target Q value with the Double DQN method reduces Q-value overestimation and makes learning more stable. Experiments in several simulation environments show that DEU-DRQNper achieves better performance in penetration test scenarios with incomplete state observation, offering an effective way to apply deep reinforcement learning to penetration test path planning with partial observability and sequence dependence. The study thereby expands the application of deep reinforcement learning in the field of penetration testing.
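
The three components named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the parameters `c` and `alpha`, and in particular the way ε-greedy and UCB are combined (random action with probability ε, otherwise the action maximizing Q plus a UCB bonus) are assumptions made for illustration, since the abstract does not specify them. The recurrent network of DRQN is omitted; Q values are passed in as plain lists.

```python
import math
import random


def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def select_action(q_values, counts, step, epsilon, c=1.0, rng=random):
    """Hybrid exploration (assumed form, not taken from the paper):
    with probability epsilon pick a uniformly random action,
    otherwise pick the action with the highest Q value plus UCB bonus."""
    n_actions = len(q_values)
    if rng.random() < epsilon:
        return rng.randrange(n_actions)

    def ucb_score(a):
        if counts[a] == 0:
            return float("inf")  # untried actions are tried first
        return q_values[a] + c * math.sqrt(math.log(step + 1) / counts[a])

    return max(range(n_actions), key=ucb_score)


def sampling_probabilities(priorities, alpha=0.6):
    """Prioritized replay: sequence i is sampled with probability
    p_i^alpha / sum_k p_k^alpha (alpha is a hypothetical setting)."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]
```

In a full agent, the priorities would typically come from the temporal-difference errors of stored sequences, and the resulting probabilities could be passed to `random.choices(sequences, weights=...)` to draw a training batch.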


Author Profile
Qinrui Sun

China Aerospace Academy of Systems Science and Engineering, China

Andorra
Hui Ge

China Aerospace Academy of Systems Science and Engineering, China

Andorra
Xiao Jin

China Aerospace Academy of Systems Science and Engineering, China

Andorra

📄 Paper information

Publication year: 2024
Citations: 1
Country of publication: Andorra
Site: ACM
