
What is it?

Overreliance is a vulnerability that arises when people or systems treat an LLM’s output as trustworthy and correct without appropriate checks. This can lead to poor decisions, security mistakes, or unsafe actions, because LLMs can generate convincing text that is incorrect, incomplete, or misleading.

LLMs are particularly risky to over-trust because: 

  • They can produce false information (hallucinations) while sounding confident  
  • They can be gullible and influenced by leading prompts  
  • They can be manipulated by attackers (e.g. via prompt injection) into unsafe outputs 
  • People may assume “the AI checked it” when it may not have 

Overreliance becomes a bigger issue when LLM outputs are used to: 

  • Make business or security decisions 
  • Produce customer-facing responses 
  • Generate code/configuration 
  • Trigger actions through automated workflows 

 

Who is at risk?

Individuals 

  • People using AI for advice may be misled into unsafe steps (clicking links, installing tools, sharing information, or following inaccurate guidance).  
  • Anyone who treats AI summaries as fact without checking original sources is at higher risk of misunderstanding or acting on incorrect information.  

Businesses and organisations 

  • Teams using LLMs for research, reports, policy, or decision support; errors here can create financial, legal, or reputational risk.  
  • Developers and IT staff using LLMs for troubleshooting or generating code, where incorrect or unsafe instructions can weaken security.  
  • Organisations deploying customer chatbots or internal assistants, where confident but wrong outputs can mislead users or cause operational harm. 

 

How attacks work

Example 1: “Confident nonsense” used to mislead 

A user asks an LLM for urgent advice (“What should I do about this suspicious email?”). The LLM provides a confident but incorrect answer (e.g. “It looks safe; click the link”). If the user trusts it without checking, they may fall for a scam. LLMs are known to produce incorrect statements that appear plausible. 

Example 2: Unsafe instructions accepted as best practice 

A user asks an LLM for steps to fix a system problem and gets a risky suggestion (e.g. disabling security settings, copying secrets into a tool, running commands they don’t understand). If followed blindly, this can create vulnerabilities. Treating LLM outputs as untrusted is important because LLMs can be wrong and can be influenced by adversarial inputs. 

Example 3: Bad information flows into business decisions 

A manager asks an LLM to summarise a document or topic and uses the summary for decisions. If the LLM misses key details, invents facts, or misrepresents the source, this can lead to incorrect actions. 

 

Controls

People 

  • Set the mindset: LLM outputs are suggestions, not facts. Encourage users to verify important claims and treat outputs as untrusted, especially for security, legal, medical, or financial topics.  
  • Train staff on common failure modes: hallucinations, persuasive tone, and susceptibility to manipulation.  

Process  

  • Define “safe uses” vs “high-risk uses”:  
    • Safer: drafting generic text, summarising non-sensitive public material (with checks).  
    • Higher risk: decisions, security configuration, code deployment, legal/HR decisions, anything with sensitive data.  
  • Require human review for high-impact outputs (customer advice, policy decisions, system changes, external communications).  
  • Use escalation paths: if an answer affects safety, money, legal position, or security, it should be checked by a qualified person or trusted source.  
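The escalation-path idea above can be sketched in code. This is a minimal, illustrative example: the topic keywords, function names, and "hold for review" behaviour are all assumptions, not a production risk classifier.

```python
# Hypothetical sketch of an escalation path: hold LLM answers that touch
# high-risk areas for a qualified human reviewer. Keyword lists are
# illustrative only; a real system would use richer classification.
HIGH_RISK_TOPICS = {
    "safety": ["disable", "override", "bypass"],
    "money": ["payment", "transfer", "invoice", "refund"],
    "legal": ["contract", "liability", "dismissal"],
    "security": ["firewall", "credentials", "password", "certificate"],
}

def needs_escalation(answer: str) -> list[str]:
    """Return the risk areas an answer touches; empty means lower risk."""
    text = answer.lower()
    return [area for area, words in HIGH_RISK_TOPICS.items()
            if any(word in text for word in words)]

def handle_answer(answer: str) -> str:
    """Release low-risk answers; queue high-risk ones for human review."""
    areas = needs_escalation(answer)
    if areas:
        return f"HELD FOR REVIEW ({', '.join(areas)})"
    return answer
```

Keyword matching is deliberately crude here; the point is the routing decision, not the detection method.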

Tech 

  • Make verification easy: link to sources, keep citations, and encourage users to open the original document rather than trusting summaries alone (especially for internal decision-making). This directly mitigates the incorrect information risk.  
  • Prevent auto-action based on AI output: avoid systems where LLM responses automatically trigger changes, emails, payments, scripts, or security configuration without checks. This reduces the impact of wrong or manipulated outputs.  
  • Use guardrails for sensitive domains: implement internal policies or technical controls that block risky behaviours (e.g. pasting secrets, sharing customer data, or generating high-risk instructions). Overreliance risks increase when LLMs are connected to tools and data, so restricting access reduces impact.
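The "prevent auto-action" control can be made concrete with a small gate that sits between the model and any side effect. This is a sketch under assumptions: `ProposedAction`, `ActionGate`, and the approve/execute flow are hypothetical names, not a real framework API.

```python
# Hypothetical sketch: LLM output can only *propose* an action; a human
# must approve it before anything executes. Class and method names are
# illustrative, not a real library.
from dataclasses import dataclass, field

@dataclass
class ProposedAction:
    description: str          # e.g. "send refund email to customer"
    approved: bool = False

@dataclass
class ActionGate:
    pending: list = field(default_factory=list)

    def propose(self, description: str) -> ProposedAction:
        """Record an LLM-suggested action without executing it."""
        action = ProposedAction(description)
        self.pending.append(action)
        return action

    def approve(self, action: ProposedAction) -> None:
        """Called by a human reviewer, never by the model."""
        action.approved = True

    def execute(self, action: ProposedAction) -> str:
        if not action.approved:
            raise PermissionError("LLM output alone cannot trigger actions")
        return f"executed: {action.description}"
```

The design choice is that execution fails closed: without an explicit human approval, a wrong or manipulated model output cannot cause payments, emails, or configuration changes.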
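A simple guardrail of the kind described, blocking prompts that appear to contain secrets before they reach the model, might look like the sketch below. The patterns are illustrative and far from exhaustive; a real deployment would use a maintained secrets-detection tool.

```python
import re

# Hypothetical sketch: screen prompts for likely secrets before sending
# them to an LLM. Patterns are examples only, not a complete ruleset.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),       # PEM private keys
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                     # AWS access key ID shape
    re.compile(r"(?i)\b(password|api[_-]?key|secret)\s*[:=]\s*\S+"),
]

def check_prompt(prompt: str) -> bool:
    """Return True if the prompt looks safe to send, False if it may contain a secret."""
    return not any(pattern.search(prompt) for pattern in SECRET_PATTERNS)
```

Blocking at the boundary like this limits what an over-trusting user can leak, and it also limits the blast radius if the model is later manipulated into echoing its inputs.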