Abstract

Large language models (LLMs), a type of generative AI popularized by ChatGPT in 2022 (Hu & Hu, 2023), are well suited for cybersecurity incident response (CSIR) because they can analyze extensive logs for attack detection and response. In CSIR, LLMs are used primarily to analyze the vast data acquired by cybersecurity teams (Bokkena, 2024; Ferrag et al., 2025; Ji et al., 2024; Xu et al., 2024). CSIR is unusual in that LLMs intentionally ingest malicious input. An attacker could directly trigger cost harvesting by flooding security logs, or carry out prompt injection through malicious log entries. No single LLM risk framework comprehensively addresses all risks, as noted in the MIT literature survey (Slattery et al., 2024). This paper consolidates various LLM and CSIR risk frameworks into a comprehensive resource to answer the question: What are the risks of using LLMs in CSIR?

Risk frameworks for LLMs and CSIR were identified and summarized into a unified framework. Three main LLM risk frameworks were identified: NIST AI 600-1, OWASP Top 10 for LLMs, and MITRE ATLAS (ATLAS Matrix, 2025; OWASP LLM, 2025; National Institute of Standards and Technology (US), 2024). No CSIR-specific risk frameworks were identified; NIST 800-61 outlines CSIR processes and activities but does not include a risk framework. We found literature showing that cybersecurity incidents are a form of emergency (Onwubiko & Ouazzane, 2022), and thus emergency management (EM) risks are related. Radianti and Khazanchi (2024) proposed a framework for assessing risks in EM. The applicability of the Radianti and Khazanchi risk framework to CSIR was validated using NIST 800-61 as a reference.

Our proposed unified risk model categorizes risks into three main categories, each with two subcategories. The three main categories (technical, application/user, and business) are based on an assurance framework presented by Khazanchi and Sutton (2001). The technical category encompasses attack surface risks and amplified threat risks, aligning with the framework's focus on infrastructure security and data integrity. The application and user category addresses adoption and operational risks, encompassing challenges related to user integration, implementation, and organizational readiness. The business category examines communication and decision-making risks, highlighting the strategic and operational implications of LLM adoption.

Because CSIR collects malicious logs, it faces unique risks such as cost harvesting and prompt injection. Future research should focus on identifying additional CSIR risks, operationalizing the assessment and assurance of these risks, developing innovative mitigations for key risks such as prompt injection, and creating an assurance framework to support the adoption and integration of LLMs for CSIR.
