Research field: Strategies
Conference: ICMLCA '23: Proceedings of the 2023 4th International Conference on Machine Learning and Computer Application
In this paper, a new deep reinforcement learning algorithm, DEU-DRQNper, is proposed for penetration test path planning under partially observable conditions. The algorithm builds on DRQN with three improvements: prioritized sequence experience replay, an action exploration and selection strategy that combines the ε-greedy and UCB algorithms, and Double DQN-style computation of the target Q-value. During sampling, the prioritized sequence experience replay mechanism draws experience sequences according to a probability distribution derived from their priorities, so that more valuable sequences are selected more often and learning is accelerated. The hybrid ε-greedy and UCB exploration strategy balances exploration and exploitation, thus improving efficiency. Computing the target Q-value in the Double DQN manner effectively addresses Q-value overestimation and makes learning more stable. Experiments in several simulation environments show that DEU-DRQNper achieves better performance in penetration test scenarios with incomplete state observation, providing an effective way to apply deep reinforcement learning to penetration test path planning with partial observability and sequence dependence. This study expands the application of deep reinforcement learning in the field of penetration testing.
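
The three components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how ε-greedy and UCB are combined, so the rule below (random action with probability `epsilon`, otherwise the action maximizing Q plus a UCB bonus) is an assumption, as are all function names, the exponent `alpha`, and the bonus weight `c`. Q-values are plain lists standing in for the outputs of the recurrent online and target networks.

```python
import math
import random


def double_dqn_target(reward, done, gamma, q_online_next, q_target_next):
    """Double DQN target: the online net picks the next action,
    the target net evaluates it, which reduces overestimation."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


def select_action(q_values, counts, t, epsilon, c, rng=random):
    """Assumed hybrid rule: explore uniformly with probability epsilon,
    otherwise maximize Q(a) + c * sqrt(ln t / N(a)). Requires t >= 1."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))

    def score(a):
        if counts[a] == 0:
            return float("inf")  # untried actions are tried first
        return q_values[a] + c * math.sqrt(math.log(t) / counts[a])

    return max(range(len(q_values)), key=score)


def sample_sequences(priorities, batch_size, alpha, rng=random):
    """Prioritized sampling of stored sequences:
    P(i) is proportional to priority_i ** alpha."""
    weights = [p ** alpha for p in priorities]
    return rng.choices(range(len(priorities)), weights=weights, k=batch_size)
```

For example, `double_dqn_target(1.0, False, 0.9, [0.2, 0.8, 0.5], [1.0, 2.0, 3.0])` selects action 1 with the online values and evaluates it with the target values, giving 1.0 + 0.9 × 2.0 = 2.8, whereas a plain DQN target would use the larger value 3.0.
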
| Publication year | 2024 |
|---|---|
| Citations | 1 |
| Country of publication | Andorra |
| Site | ACM |