Location

Online

Event Website

https://hicss.hawaii.edu/

Start Date

3-1-2023 12:00 AM

End Date

7-1-2023 12:00 AM

Description

Reinforcement Learning (RL) algorithms have shown success in scaling up to large problems. However, deploying these algorithms in real-world applications remains challenging due to their vulnerability to adversarial perturbations. Existing RL robustness methods against adversarial attacks remain weak under large perturbations, a scenario that cannot be ruled out for RL adversarial threats, just as it cannot for deep neural networks in classification tasks. This paper proposes a method called observation-shielding RL (OSRL) to increase the robustness of RL against large perturbations using predictive models and threat detection. Instead of changing the RL algorithms with robustness regularization or retraining them with adversarial perturbations, we depart considerably from previous approaches and develop an add-on safety feature for existing RL algorithms at runtime. OSRL builds on the idea of model predictive shielding, where an observation predictive model is used to override perturbed observations as needed to ensure safety. Extensive experiments on various MuJoCo environments (Ant, Hopper) and the classical pendulum environment demonstrate that our proposed OSRL is safer and more efficient than state-of-the-art robustness methods under large perturbations.
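The abstract outlines OSRL's core mechanism: a predictive model estimates what the next observation should be, a detector flags received observations that deviate too far from that estimate, and the prediction overrides the suspect observation before the policy acts. The Python sketch below illustrates one plausible way to wire such a runtime shield around an existing policy; the class and method names, the norm-based detector, and the gym-style environment loop are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

class ObservationShield:
    """Sketch of an observation shield: a learned predictive model estimates
    the next observation, and if the received observation deviates too much
    (suspected adversarial perturbation), the prediction overrides it."""

    def __init__(self, predictive_model, threshold):
        self.model = predictive_model   # assumed to predict next obs from (obs, action)
        self.threshold = threshold      # detection threshold on prediction error
        self.prev_obs = None
        self.prev_action = None

    def filter(self, received_obs):
        # First step: no history to predict from, pass the observation through.
        if self.prev_obs is None:
            return received_obs
        predicted_obs = self.model.predict(self.prev_obs, self.prev_action)
        # Simple threat detection: a large deviation from the prediction is
        # treated as a perturbed observation and replaced by the prediction.
        if np.linalg.norm(received_obs - predicted_obs) > self.threshold:
            return predicted_obs
        return received_obs

    def record(self, obs, action):
        # Remember the observation actually used and the action taken,
        # so the next prediction starts from a trusted state.
        self.prev_obs, self.prev_action = obs, action


def run_episode(env, policy, shield):
    # Run one episode with the shield wrapped around the observation stream
    # (classic gym-style step API assumed).
    obs = env.reset()
    done = False
    while not done:
        safe_obs = shield.filter(obs)
        action = policy(safe_obs)
        shield.record(safe_obs, action)
        obs, _reward, done, _info = env.step(action)
```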

Title

Safe Reinforcement Learning via Observation Shielding


Paper URL

https://aisel.aisnet.org/hicss-56/st/digital_forensics/2