AI incidents can look like normal cyber incidents (malware, account compromise, data breach), but when LLMs are involved the trigger and evidence can be different (e.g. prompt injection, unsafe tool use, or a compromised plugin). Because AI tools are increasingly embedded in business workflows and connected to data sources, incident response planning and logging are critical to understanding what happened, containing damage, and recovering quickly.

This page gives a practical incident response approach for LLM-enabled systems, including: 

  • Why AI-focused incident response matters 
  • What to log so you can investigate properly 
  • How to triage and contain AI-related incidents 

 

Why it matters

  1. AI incidents scale quickly

When an LLM is connected to email, file storage, internal documents, or business systems, the potential impact of a single failure can increase (e.g. more data exposure or unintended actions).  

  2. AI incidents are often “text-driven”

Some AI attacks and failures are triggered by natural language (prompts, documents, web pages, tickets), which can manipulate the model (e.g. prompt injection) or cause unsafe outputs/actions.  

  3. Evidence can disappear if you don’t plan ahead 

If you don’t collect the right logs (prompts, tool calls, retrieved sources, plugin changes), it can be hard to prove what data was accessed, what actions occurred, or how the system was manipulated.  

  4. Security guidance stresses preparation + monitoring
  • Our guidance on incident response & recovery for smaller businesses highlights the importance of planning, identifying what’s happening, and using logs to help determine cause and impact. 
  • NIST’s incident response guidance is designed to help organisations prepare, reduce impact, and improve detection/response/recovery effectiveness.  
  • The UK’s AI Cyber Security Code of Practice includes requirements to create and maintain incident management and recovery plans and to monitor system behaviour. 

 

What to log so you can investigate and determine impact

  1. Identity and access 
  • user identity (account ID), role, and authentication method 
  • source IP/device ID (where possible) 
  • session IDs / correlation IDs to link activity together 
  2. Prompt and conversation data 
  • the prompt text (or a redacted version) 
  • conversation/session ID and timestamps 
  • model/system prompt version (so you know what rules existed at the time) 

Practical tip: If storing full prompts is too sensitive, store: 

  • a hashed copy + metadata (time/user/model) 
  • redacted content (mask PII/secrets) 

This balances forensic value and privacy, consistent with the idea that logs should support investigations while being protected.  
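The tip above can be sketched as a small logging helper. This is a minimal illustration, not a production redaction pipeline: the regex patterns, field names, and `log_record` function are all hypothetical, and a real deployment would use a proper DLP or redaction service.

```python
import hashlib
import re

# Illustrative patterns only -- real deployments should use a DLP library
# or vendor redaction service rather than hand-rolled regexes.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "<CARD>"),
]

def redact(text: str) -> str:
    """Mask obvious PII/secrets before the prompt is written to logs."""
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text

def log_record(prompt: str, user: str, model: str, ts: str) -> dict:
    """Store a hash of the full prompt plus a redacted copy and metadata."""
    return {
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "prompt_redacted": redact(prompt),
        "user": user,
        "model": model,
        "timestamp": ts,
    }
```

The hash lets investigators later confirm whether a specific prompt (obtained from another source) matches a logged event, without the log itself holding the sensitive text.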

  3. Retrieval and data source traces 

If your system retrieves documents to answer questions (e.g. “chat with your documents”): 

  • which documents/records were retrieved (IDs, paths, URLs) 
  • which connectors were used (SharePoint, file shares, email) 
  • access decisions (allowed/blocked) 
  4. Tool/connector/plugin actions 

For assistants/agents with tools: 

  • every tool call (tool name, parameters, target system) 
  • whether the call was approved (human-in-the-loop) 
  • result status (success/fail) and returned data size 
  5. Output handling 
  • model output (or redacted output) 
  • where it was rendered/used (web UI, email template, database query builder, script runner) 
  6. Configuration and change events 
  • plugin/connector enablement/disablement 
  • permission changes (scopes widened, new service accounts) 
  • model version changes, policy/guardrail changes 
  • key rotation events 
  7. Security telemetry 
  • authentication failures and unusual sign-in patterns 
  • DLP alerts (if used) 
  • network egress anomalies (unexpected uploads) 
  • endpoint alerts on agent hosts (malware/stealers) 
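The tool-call fields listed above work best as one structured record per call, tied together by a session/correlation ID. A minimal sketch (field names are illustrative, not a standard schema):

```python
import json
from datetime import datetime, timezone

def tool_call_event(session_id: str, user_id: str, tool: str,
                    params: dict, approved: bool, status: str,
                    result_bytes: int) -> str:
    """Emit one JSON line per tool call so events can be correlated later.

    The session_id links this record to the prompt, identity, and
    retrieval records for the same conversation.
    """
    event = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "session_id": session_id,      # correlation ID across record types
        "user_id": user_id,
        "tool": tool,
        "params": params,              # redact sensitive parameters in practice
        "approved": approved,          # human-in-the-loop decision
        "status": status,              # success/fail
        "result_bytes": result_bytes,  # unusually large values can indicate exfiltration
    }
    return json.dumps(event)
```

JSON lines are easy to centralise and query later; the same shape can carry retrieval and configuration-change events too.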

 

How to triage and contain AI-related incidents

Step 1 — Confirm the incident type 

Start by classifying the report into one (or more) of these: 

  • Data exposure   
  • Prompt injection / manipulation   
  • Agent/tool abuse   
  • Output-handling exploit (XSS/SSRF/code execution paths) 
  • Supply chain issue   

Step 2 — Decide severity  

Use impact-based severity questions: 

  • Confidentiality: Was sensitive data accessed or shared externally?  
  • Integrity: Did the assistant change records, permissions, configurations, or content?  
  • Availability: Is the system down, looping, or causing unexpected cost spikes?  

Also ask: 

  • Is the incident ongoing?  
  • Does the agent have broad permissions?  
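The impact questions above can be folded into a simple severity helper. The thresholds here are purely illustrative, a sketch to tune against your own risk appetite, not a standard scoring scheme:

```python
def triage_severity(confidentiality: bool, integrity: bool,
                    availability: bool, ongoing: bool,
                    broad_permissions: bool) -> str:
    """Map the CIA impact questions and the two follow-ups to a coarse band.

    Any ongoing incident, or one involving a broadly-permissioned agent,
    is escalated regardless of how many impact types are confirmed.
    """
    impact = sum([confidentiality, integrity, availability])
    if impact == 0:
        return "low"
    if ongoing or broad_permissions or impact >= 2:
        return "high"
    return "medium"
```
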

Step 3 — Immediate containment  

Choose containment actions based on type.

If it looks like data exposure:

  • Disable the affected connector/source (e.g. stop indexing or querying the dataset).  
  • Restrict access to the AI feature to a smaller group until understood.  

If it looks like prompt injection:

  • Block/strip the malicious content source (document/URL/email) and prevent it being retrieved again.  
  • Add temporary guardrails: higher scrutiny, stricter output filtering, or human approval gates.  

If it looks like agent/tool abuse:

  • Disable tool access (or switch to read-only tools) and require approvals for high-impact actions.  
  • Revoke/rotate tokens and credentials used by the agent’s connectors.  

If it looks like insecure output handling:

  • Disable any feature that renders/executes output (HTML preview, script execution, query execution) until outputs are sanitised/encoded.  
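For output handling, “sanitised/encoded” usually means encoding model output for the context it lands in before re-enabling the feature. A minimal sketch for the web-UI case, using Python's standard-library `html.escape` (the function name `render_model_output` is hypothetical):

```python
import html

def render_model_output(output: str) -> str:
    """Encode model output before it reaches a web UI, so injected markup
    (e.g. a <script> tag produced via prompt injection) is displayed as
    text rather than executed by the browser."""
    return html.escape(output)
```

The same principle applies to other sinks: parameterise queries rather than concatenating model output into SQL, and never pass raw output to a script runner.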

If it looks like supply chain compromise:

  • Disable the suspicious plugin/tool/platform integration immediately.  
  • Rotate keys/tokens the integration could have accessed and monitor for misuse.  

Tip: Containment should be reversible and prioritise stopping ongoing harm first. This aligns with standard incident response outcomes (detect → respond → recover).  
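One way to make containment reversible is to gate AI capabilities behind feature flags that can be flipped off (and later back on) without deleting configuration or data. A minimal sketch; the flag names are hypothetical examples:

```python
# Reversible "kill switches": capabilities default on, and containment
# flips a flag rather than tearing down configuration.
CONTAINMENT_FLAGS = {
    "connector.sharepoint.enabled": True,   # data exposure containment
    "tools.write_actions.enabled": True,    # agent/tool abuse containment
    "output.html_preview.enabled": True,    # insecure output handling containment
}

def contain(flag: str) -> None:
    """Disable a capability to stop ongoing harm."""
    CONTAINMENT_FLAGS[flag] = False

def restore(flag: str) -> None:
    """Reverse the containment action once the incident is understood."""
    CONTAINMENT_FLAGS[flag] = True
```
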

Step 4 — Scope the incident  

Use the logs you collected to answer: 

  • Which users/sessions were involved?  
  • What prompts were used (or what content was retrieved)?  
  • Which documents/records were accessed?  
  • Which tools were called and what actions occurred?  
  • Was data sent externally (emails, webhooks, uploads)?  
  • Were there configuration changes (plugins enabled, permissions broadened)?  
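If the logs were collected as correlated records (as described in the logging section), scoping reduces to filtering by the affected session IDs and summarising. A sketch, assuming JSON-style event dicts with a shared `session_id` field; the field and tool names are illustrative:

```python
def scope_incident(events: list[dict], session_ids: set[str]) -> dict:
    """Pull everything linked to the affected sessions from a combined
    event log and summarise: users involved, documents touched, tools
    called, and anything that sent data externally."""
    involved = [e for e in events if e.get("session_id") in session_ids]
    return {
        "users": sorted({e["user_id"] for e in involved if "user_id" in e}),
        "documents": sorted({d for e in involved for d in e.get("documents", [])}),
        "tools": sorted({e["tool"] for e in involved if "tool" in e}),
        "external_sends": [e for e in involved
                           if e.get("tool") in {"send_email", "webhook", "upload"}],
    }
```
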

Step 5 — Eradicate and recover  

Typical recovery steps include: 

  • Patch/update affected components (including AI orchestration layers, connectors, and dependencies).  
  • Reset credentials/tokens and reduce permissions (least privilege).  
  • Restore from known good state if data integrity was impacted.  
  • Re-enable features gradually with stronger controls (approvals, filtering, monitoring). 

 

Building your incident response plan

A) Prepare

  • Identify critical systems and data your AI tools can access. 
  • Maintain contact lists and escalation routes (IT provider, vendors, CSC reporting, regulator if relevant).  
  • Define how to disable or isolate AI features quickly.

B) Enable evidence collection

  • Centralise logs and ensure sufficient retention to investigate.  
  • Ensure logs are protected from unauthorised access/modification.  

C) Test your response

  • Run tabletop exercises: prompt injection scenario, data disclosure scenario, compromised connector scenario.