Paper Number

1740

Paper Type

Complete

Description

Patching in a timely manner has proven to be one of the most effective ways to protect enterprise information systems from cyberattacks. However, as patching operations are not cost-free, enterprises typically delay patch deployments. Balancing operational expenses and system security risks to determine the optimal timing of patching remains an ongoing challenge. This study addresses the patching decision problem by proposing a novel deep reinforcement learning-based approach. Specifically, we model the patching problem as a Markov decision process with a thorough consideration of various costs, dynamics, and uncertainties. To avoid the curse of dimensionality and obtain an effective patching policy, we develop a novel reinforcement learning method called Action-Decomposed Proximal Policy Optimization (ADPPO). Experimental results indicate that the proposed approach significantly outperforms benchmarks. This study contributes to both the cybersecurity management and the reinforcement learning communities.
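
The abstract does not say how ADPPO decomposes actions, so the following is only a minimal sketch of one common reading of "action decomposition", not the authors' method. It assumes a hypothetical setting in which each of N systems gets a binary patch/defer decision at every step. Deciding all systems jointly gives 2^N actions, while a factorized policy needs only N per-system probabilities and scores a joint action by summing per-system log-probabilities. The names `joint_action_count`, `factorized_log_prob`, and `patch_probs` are illustrative assumptions.

```python
import math
from itertools import product


def joint_action_count(n_systems: int) -> int:
    """Size of the joint action space when every system is patched or deferred."""
    return 2 ** n_systems


def factorized_log_prob(patch_probs, action):
    """Log-probability of a joint action under independent per-system decisions.

    patch_probs[i] is the (assumed) probability of patching system i;
    action[i] is 1 for "patch now" and 0 for "defer".
    """
    total = 0.0
    for p, a in zip(patch_probs, action):
        total += math.log(p if a == 1 else 1.0 - p)
    return total


# Hypothetical example: three systems, each with its own patch probability.
patch_probs = [0.9, 0.2, 0.5]
action = (1, 0, 1)  # patch system 0, defer system 1, patch system 2
log_p = factorized_log_prob(patch_probs, action)

# Sanity check: the factorized probabilities over all joint actions sum to 1.
total_prob = sum(
    math.exp(factorized_log_prob(patch_probs, a))
    for a in product((0, 1), repeat=len(patch_probs))
)
```

In a PPO-style update, a summed log-probability of this kind would stand in for the joint action's log-probability when forming the policy ratio, so the policy output grows linearly rather than exponentially with the number of systems.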

Comments

06-Security

Dec 15th, 12:00 AM

Enterprise Security Patch Management with Deep Reinforcement Learning
