LLM Abstention Can Be a Prompt Artifact, in Addition to Genuine Uncertainty

Year
2025

Abstract & full text: arxiv.org/abs/2507.16199

Abstract

Large Language Models (LLMs) are increasingly trained to abstain from answering questions they are unsure about. However, this ability is often misused: in real-world applications, input prompts sometimes contain elements that signal uncertainty, and these cues incline LLMs to abstain even on problems they are capable of solving. We argue that LLM abstention is not only an expression of genuine uncertainty; it is also an artifact that prompts can strongly influence. We name this phenomenon Abstention Inflation. When we add "Unknown" as an extra option for LLMs to choose from, experiments show serious accuracy drops on True/False Questions (TFQs). Replacing "Unknown" with an unrelated random word produces an identical effect. We argue that LLMs are trained to imitate the surface pattern of abstention rather than to express genuine uncertainty. Based on ten experiments, we support four claims that form a progressive argument: (C1) Abstention Inflation is triggered by the structural presence of an extra option, not by genuine uncertainty; (C2) it makes the model deny that it can answer even when it can; (C3) at the representation level, it manifests as a later-layer output override; (C4) the bias is stable and emerges through instruction tuning rather than from stochastic noise.
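
The extra-option manipulation described in the abstract can be illustrated with a small sketch. This is not the authors' code: the prompt wording, the helper names `build_tfq_prompt` and `score`, the example statement, and the placeholder word "Banana" are all assumptions made for illustration. The sketch builds the same True/False question in three variants (no extra option, "Unknown", and an unrelated word) and computes accuracy alongside the rate at which the extra option is chosen.

```python
from typing import Optional, Sequence, Tuple


def build_tfq_prompt(statement: str, extra_option: Optional[str] = None) -> str:
    """Format a True/False question, optionally with one extra answer option.

    The wording is a hypothetical template, not the one used in the paper.
    """
    options = ["True", "False"]
    if extra_option is not None:
        options.append(extra_option)
    return (
        f"Statement: {statement}\n"
        f"Answer with exactly one of: {', '.join(options)}.\n"
        "Answer:"
    )


def score(
    predictions: Sequence[str],
    gold: Sequence[str],
    extra_option: Optional[str] = None,
) -> Tuple[float, float]:
    """Return (accuracy, extra_option_rate) for already-parsed model answers."""
    if len(predictions) != len(gold) or not gold:
        raise ValueError("predictions and gold must be non-empty and equal in length")
    n = len(gold)
    correct = sum(p == g for p, g in zip(predictions, gold))
    # An extra-option answer is never correct, since gold labels are True/False.
    extra = 0
    if extra_option is not None:
        extra = sum(p == extra_option for p in predictions)
    return correct / n, extra / n


# Print the three prompt variants for one illustrative statement.
statement = "Water boils at 100 degrees Celsius at sea level."
for extra in (None, "Unknown", "Banana"):
    print(build_tfq_prompt(statement, extra))
    print("---")
```

Comparing `score` across the three variants on the same questions is one way to separate a drop caused by the mere presence of a third option from one specific to the word "Unknown"; how the model is queried and how its answers are parsed are left out here.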