Large Language Models (LLMs) are increasingly trained to abstain from answering questions they are unsure about. However, this ability often misfires: in real-world applications, input prompts sometimes contain elements that signal uncertainty, and these cues lead LLMs to abstain even on problems they are capable of solving. We argue that LLM abstention is not only an expression of genuine uncertainty; it is also an artifact that can be strongly influenced by the prompt. We name this phenomenon Abstention Inflation. When we add "Unknown" as an extra option for LLMs to choose from, experiments show serious accuracy drops on True/False Questions (TFQs). Replacing "Unknown" with an unrelated random word produces an identical effect. We argue that LLMs are trained to imitate the surface pattern of abstention, rather than to express genuine uncertainty. Based on ten experiments, we support four claims that form a progressive argument: (C1) Abstention Inflation is triggered by the structural presence of an extra option, not by genuine uncertainty; (C2) further, it makes the model deny that it can answer even when it can; (C3) at the representation level, this manifests as a later-layer output override; (C4) finally, this bias is stable and emerges through instruction tuning, rather than arising from stochastic noise.
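The extra-option manipulation described above can be illustrated with a minimal sketch. It is not the authors' code: the prompt template, the function names (`build_tfq_prompt`, `accuracy`, `extra_option_rate`), and the toy predictions are all hypothetical, chosen only to show how a True/False prompt might be built with and without a third option and how the resulting accuracy drop could be measured.

```python
from typing import Optional, Sequence


def build_tfq_prompt(statement: str, extra_option: Optional[str] = None) -> str:
    """Build a True/False prompt, optionally with one extra option.

    The template is an assumption for illustration; the paper's exact
    wording is not given in the abstract.
    """
    options = ["True", "False"]
    if extra_option is not None:
        # e.g. "Unknown", or an unrelated random word as a control
        options.append(extra_option)
    labels = "ABC"
    lines = [f"Statement: {statement}", "Options:"]
    lines += [f"{labels[i]}. {opt}" for i, opt in enumerate(options)]
    lines.append("Answer:")
    return "\n".join(lines)


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if not gold or len(predictions) != len(gold):
        raise ValueError("predictions and gold must be non-empty and equal length")
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


def extra_option_rate(predictions: Sequence[str], extra_option: str) -> float:
    """Fraction of predictions that selected the extra option."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    return sum(p == extra_option for p in predictions) / len(predictions)


if __name__ == "__main__":
    print(build_tfq_prompt("Water boils at 100 degrees Celsius at sea level.", "Unknown"))

    # Toy placeholder outputs, NOT results from the paper.
    gold = ["True", "False", "True", "False"]
    baseline_preds = ["True", "False", "True", "False"]
    with_option_preds = ["True", "Unknown", "Unknown", "False"]

    drop = accuracy(baseline_preds, gold) - accuracy(with_option_preds, gold)
    print(f"accuracy drop: {drop:.2f}")
    print(f"extra-option rate: {extra_option_rate(with_option_preds, 'Unknown'):.2f}")
```

Passing an unrelated word as `extra_option` instead of "Unknown" gives the control condition mentioned in the abstract; comparing the two accuracy drops is one way to separate a structural effect from genuine uncertainty.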
LLM Abstention Can Be a Prompt Artifact, in Addition to Genuine Uncertainty
- Year
- 2025
Attribution
- Abstract & full text
- arxiv.org/abs/2507.16199