Why AI security is different

Traditional software follows strict rules, whereas LLMs interpret language and context, which means they can be manipulated through text (including hidden instructions). When LLMs connect to email, file storage, ticketing systems, or plugins, the impact of a mistake increases because the AI can become a pathway to sensitive data or actions.

AI should be secure by design across the whole lifecycle, not bolted on as an afterthought, especially given the pace of change and evolving threats.

 

The main risks organisations should plan for

1) Sensitive information disclosure 

Staff may paste confidential information into prompts, or an AI tool may expose internal data if it is connected to company documents or systems. This can create privacy, contractual, and reputational risk.  

2) Prompt injection and manipulated outputs 

Attackers can craft inputs (or hide instructions in documents/web pages/tickets) that cause the LLM to ignore rules or behave in unintended ways. This becomes more serious when the AI can call tools or retrieve internal documents.  

3) Excessive permissions (agents and assistants) 

If an AI assistant has broad access to email, files, or business apps, a single tricked interaction can lead to unnecessary data exposure or unauthorised actions. Limiting permissions reduces the blast radius.  

4) Plugin and tool supply-chain risks 

Many AI tools allow third-party plugins. Some may be poorly designed, others may be actively malicious.  

5) Overreliance and unsafe automation 

LLMs can generate plausible but incorrect content. Overreliance can lead to errors (e.g. policy decisions based on false summaries, or staff following unsafe instructions). Human review remains important, especially for high-impact decisions.  

 

Practical controls

1) Set clear rules for safe use  

At minimum, define: 

  • Which AI tools are approved for staff use  
  • What data must never be entered (e.g. credentials, customer data, sensitive casework)  
  • When human review is required (e.g. external communications, legal content, security changes)  
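
A simple technical backstop for the "what data must never be entered" rule is to screen prompts before they reach an AI tool. The sketch below is illustrative only: the patterns and the `check_prompt` helper are assumptions, and a real deployment would use a proper data loss prevention (DLP) service with organisation-specific rules.

```python
import re

# Hypothetical patterns for data that should never reach an AI tool.
# A real deployment would use a DLP service; this is only a sketch.
BLOCKED_PATTERNS = {
    "api_key": re.compile(r"\b(?:sk|key|token)[-_][A-Za-z0-9]{16,}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "national_insurance_no": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),
}

def check_prompt(prompt: str) -> list[str]:
    """Return the names of blocked data types found in a prompt."""
    return [name for name, pat in BLOCKED_PATTERNS.items() if pat.search(prompt)]

# Example: a prompt containing an email address and a credential-like token
hits = check_prompt("Summarise: contact jo@example.com, key sk-abcdef1234567890XY")
print(hits)  # ['api_key', 'email']
```

Even a coarse filter like this makes accidental disclosure visible, which is often enough to change staff behaviour.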

2) Reduce data exposure (start with least privilege) 

If your AI tool connects to internal sources: 

  • Limit what it can access (only the folders, mailboxes, or systems required)  
  • Separate read access from write access where possible (this makes it harder for the AI to take an unintended action)  
  • Apply role-based access and MFA to the accounts used for AI integrations  
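
The least-privilege idea above can be sketched as a narrow "scope" object that the AI integration receives instead of broad account access. The class and function names here are illustrative assumptions, not a real product API.

```python
from dataclasses import dataclass

# Sketch of a least-privilege connector scope: the AI integration only
# ever receives the specific paths it needs, read-only by default.
@dataclass(frozen=True)
class ConnectorScope:
    paths: frozenset[str]    # folders/mailboxes the AI may touch
    can_write: bool = False  # default to read-only

def authorise(scope: ConnectorScope, path: str, write: bool) -> bool:
    """Allow access only inside the granted paths, and writes only if granted."""
    in_scope = any(path == p or path.startswith(p + "/") for p in scope.paths)
    return in_scope and (not write or scope.can_write)

readonly_docs = ConnectorScope(paths=frozenset({"shared/policies"}))
print(authorise(readonly_docs, "shared/policies/leave.md", write=False))   # True
print(authorise(readonly_docs, "shared/finance/budget.xlsx", write=False)) # False: out of scope
print(authorise(readonly_docs, "shared/policies/leave.md", write=True))    # False: read-only
```

Defaulting `can_write` to `False` means every integration starts read-only and write access has to be granted deliberately.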

3) Assume prompt injection is possible  

You don’t need deep technical knowledge to make a meaningful improvement: 

  • Treat any external content the AI reads (web pages, emails, attachments) as untrusted  
  • Don’t let AI output automatically trigger actions (avoid auto-run workflows without checks)  
  • Use human approval for high-risk actions (sending emails, changing permissions, running scripts)  
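
The "human approval for high-risk actions" pattern can be sketched as a dispatch gate: AI output never triggers a risky action directly, it only queues it for review. The action names and risk list below are illustrative assumptions.

```python
# Sketch: actions the AI proposes are routed through a gate. High-risk
# actions are held for human approval; only low-risk ones run directly.
HIGH_RISK_ACTIONS = {"send_email", "change_permissions", "run_script"}

def dispatch(action: str, payload: dict, approval_queue: list) -> str:
    """Queue high-risk actions for a human; execute low-risk ones."""
    if action in HIGH_RISK_ACTIONS:
        approval_queue.append((action, payload))  # hold for human review
        return "pending_approval"
    return "executed"

queue: list = []
print(dispatch("send_email", {"to": "all-staff"}, queue))      # pending_approval
print(dispatch("summarise_doc", {"doc": "policy.md"}, queue))  # executed
```

The key property is that even a successfully injected prompt can only *propose* a risky action; a person still has to approve it.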

4) Control plugins and tools  

Treat AI plugins like software installations: 

  • Restrict who can enable or install plugins/tools  
  • Prefer vetted/reputable sources and review what access a plugin requests  
  • Have a plan to rotate credentials/tokens quickly if a plugin is suspected of being compromised  

5) Secure procurement: ask suppliers the right questions 

When buying AI products or enabling enterprise AI features, ask suppliers: 

  • What data is stored, where, and for how long?  
  • Can customers opt out of data retention/training, and how is this enforced?  
  • How are plugins/tools reviewed, and how quickly are security issues patched?  
  • What monitoring/logging is available for prompts, retrieval sources, and tool actions?  
  • What incident reporting and support is available if something goes wrong?  

 

AI security incident response

What to log:

LLM incidents can involve prompt injection, tool misuse, plugin compromise, or sensitive data exposure. To investigate effectively, keep logs of: 

  • Who used the AI tool (user/service account) and when  
  • The prompt or request (redacting sensitive data where appropriate)  
  • What data sources were accessed (documents, repositories, URLs, connectors)  
  • Any tool/plugin actions taken (emails sent, files accessed/changed, API calls)  
  • Admin changes (new plugins enabled, permissions changed)  
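
The fields above can be captured as a single structured audit record per AI interaction, which makes later triage far easier than free-text logs. The field names below are illustrative assumptions; match them to your own SIEM or logging schema.

```python
import json
from datetime import datetime, timezone

# Sketch of a structured audit record covering the fields listed above.
def log_ai_event(user: str, prompt_summary: str,
                 sources: list, tool_actions: list) -> str:
    """Build one JSON audit record for a single AI interaction."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,                      # user or service account
        "prompt_summary": prompt_summary,  # redacted summary, not the raw prompt
        "sources_accessed": sources,       # documents, URLs, connectors
        "tool_actions": tool_actions,      # emails sent, files changed, API calls
    }
    return json.dumps(record)

entry = log_ai_event("svc-ai-assistant", "summarise Q3 report",
                     ["sharepoint://finance/q3.docx"], [])
print(entry)
```

Note that the record stores a redacted summary of the prompt rather than the raw text, so the logs themselves do not become a second copy of any sensitive data.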

How to triage:

If you suspect an AI-related security incident: 

  1. Contain: disable suspect plugins/tools, revoke/rotate tokens and credentials if needed  
  2. Scope: check logs for what data sources were accessed and what actions occurred  
  3. Protect: restrict access permissions further (least privilege) and add approval steps for risky actions  
  4. Recover: restore affected systems/data where necessary and communicate appropriately