What is it?

Sensitive information disclosure occurs when an LLM (large language model) or an LLM-enabled application reveals information that should not be shared, whether personal data or confidential business data, through its outputs. 

This can include: 

  • Personally identifiable information (PII), financial details, health records 
  • Confidential business data (internal documents, customer information, contracts) 
  • Security credentials (API keys, passwords, recovery codes) 
  • Proprietary methods, internal source code, or sensitive operational details  

Disclosure can happen because: 

  • Users paste sensitive data into prompts, and it is later exposed (e.g. via sharing, logs, or misuse of context)  
  • The LLM is connected to private data sources (documents, email, knowledge bases) and returns more than intended  
  • The system is manipulated (often via prompt injection) into revealing or returning protected information 

 

Who is at risk?

Individuals 

  • Anyone using public AI chat tools who may accidentally share personal or sensitive information (IDs, addresses, bank details, private messages).  
  • People using AI to draft messages or solve problems who might paste in screenshots, documents, or account data without realising the risk.  

Businesses and organisations 

  • Teams using LLMs to summarise or search across internal documents (policies, HR files, customer data, contracts).  
  • Organisations enabling AI assistants/agents connected to email, files, ticketing systems, or other tools where the assistant may access and reveal data beyond what’s appropriate.  
  • Any organisation that allows staff to use non-approved AI tools for work tasks, which can lead to accidental leakage of confidential information. 

 

How attacks work

Example 1: Accidental sharing in a prompt (user mistake) 

A user pastes a customer email thread or a document into a chatbot to summarise it, not realising it contains names, phone numbers, addresses, or account details. The information is now present in the conversation context and may be stored/retained depending on the service. 

Example 2: “Secrets in text” (credentials accidentally leaked) 

Developers or users sometimes paste configuration text, logs, or code snippets into an LLM for help. If those contain API keys, tokens, passwords, or internal URLs, the model may repeat them in outputs, or they may be exposed through logging or sharing. Security credentials are explicitly listed as sensitive information in LLM risk guidance. 
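One mitigation for this scenario is to scan text for secret-like patterns before it ever reaches the model. The sketch below is a minimal, illustrative example; the pattern names and regexes are assumptions for demonstration, not a vetted ruleset (real deployments use dedicated secret-scanning tools).

```python
import re

# Hypothetical patterns -- illustrative only, not an exhaustive secret ruleset.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_api_key": re.compile(r"(?i)\b(api[_-]?key|token|secret)\s*[:=]\s*\S+"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

def find_secrets(text: str) -> list[str]:
    """Return the names of any secret patterns found in the text."""
    return [name for name, pattern in SECRET_PATTERNS.items() if pattern.search(text)]

# A pre-send check: refuse to forward the prompt if anything matches.
snippet = "config: API_KEY=sk-test-1234 host=internal.example"
hits = find_secrets(snippet)
if hits:
    print(f"Blocked: possible secrets detected ({', '.join(hits)})")
```

A check like this sits in front of the LLM call, so a pasted config file or log is held back for review instead of being sent and retained in the conversation context.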

Example 3: Overbroad answers from connected data (document chat) 

A business sets up an “ask our documents” tool. A staff member asks: 

“What is the process for refunds for Customer X?” 

If access controls and filtering are weak, the assistant might return more than needed (e.g. copying sensitive customer details or internal notes). This is sensitive information disclosure via application context. 
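The access-control point above can be sketched as a filter between retrieval and the model: documents that match the query are only placed in the LLM's context if the requesting user is entitled to see them. The document and role names below are hypothetical, chosen to mirror the refund example.

```python
from dataclasses import dataclass

# Hypothetical document and user models -- names are illustrative assumptions.
@dataclass
class Document:
    doc_id: str
    text: str
    allowed_roles: set[str]

def retrieve_for_user(query_hits: list[Document], user_roles: set[str]) -> list[Document]:
    """Pass only documents the requesting user may see into the LLM context."""
    return [doc for doc in query_hits if doc.allowed_roles & user_roles]

docs = [
    Document("refund-policy", "Refunds are processed within 14 days.", {"support", "finance"}),
    Document("customer-x-notes", "Internal notes on Customer X's account.", {"finance"}),
]

# A support agent's question retrieves the policy, but not the internal notes.
context = retrieve_for_user(docs, {"support"})
```

The key design choice is that filtering happens before the model sees anything: a prompt-level instruction such as "don't reveal internal notes" would not be a substitute, because the model cannot leak content that was never retrieved.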

 

Controls

People 

  • Train users: don’t paste passwords, one-time codes, bank details, ID scans, customer records, or confidential documents into AI tools unless your organisation explicitly approves it and safeguards are in place.  
  • Encourage a “pause and check” habit: if an AI response contains sensitive details, treat it as a possible disclosure incident and stop sharing/forwarding it.  

Process  

  • Approved tools list: define which AI tools staff may use for work, and what types of data are permitted.  
  • Data classification rules for AI: clearly define “never enter” categories (credentials, customer PII, sensitive casework).  
  • Terms of use / transparency: OWASP recommends clear terms of use, including the option for users to opt out of having their data used for training where applicable.  
  • Incident readiness: prepare to respond to AI-related disclosures as you would any data incident (contain, scope, notify as appropriate).  

Tech  

  • Data minimisation and sanitisation: remove or mask sensitive fields before data is sent to the model (redaction/tokenisation).  
  • Access control / least privilege: if the LLM can access documents or systems, ensure it can only access what it needs (and only what the user is allowed to see).  
  • Restrict and validate data sources: for document chat, tightly control what content is indexed and retrieved, treat external content as untrusted.  
  • Output controls: add guardrails to detect and block sensitive data patterns (PII, credentials) from appearing in responses. Remember that OWASP warns prompt-only restrictions can be bypassed, so combine them with other controls.  
  • Logging and monitoring: capture enough evidence to investigate suspected disclosure (who asked, what sources were accessed, what was returned), while protecting logs from becoming a new leakage point. Event logging is widely emphasised as key to understanding scope during incidents.  
  • Secure AI data across the lifecycle: CISA emphasises robust data protection, monitoring, and threat detection for data used to train and operate AI systems, particularly sensitive or proprietary data.
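The output-control and sanitisation points above can be sketched as a last-mile redaction pass over model responses. The patterns and placeholder labels below are simplified assumptions for illustration; production systems rely on dedicated PII/credential detectors rather than a handful of regexes.

```python
import re

# Illustrative patterns only -- real deployments use vetted PII/credential detectors.
REDACTIONS = [
    (re.compile(r"\b\d{16}\b"), "[REDACTED-CARD]"),                      # 16-digit card-like numbers
    (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), "[REDACTED-EMAIL]"),    # email addresses
    (re.compile(r"(?i)\b(password|api[_-]?key)\s*[:=]\s*\S+"), "[REDACTED-CREDENTIAL]"),
]

def sanitise_output(text: str) -> str:
    """Mask sensitive patterns in a model response before it reaches the user."""
    for pattern, replacement in REDACTIONS:
        text = pattern.sub(replacement, text)
    return text

print(sanitise_output("Contact jane@example.com, card 4111111111111111"))
```

Because pattern matching misses novel formats, this layer complements, rather than replaces, the access-control and data-minimisation measures listed above.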