Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency

Haoming Xu, Ningyuan Zhao, Yunzhi Yao, Weihong Xu, Hongru Wang, Xinle Deng, Shumin Deng, Jeff Z. Pan, Huajun Chen, Ningyu Zhang

Published 2026 in arXiv.org

ABSTRACT

As Large Language Models (LLMs) are increasingly deployed in real-world settings, correctness alone is insufficient. Reliable deployment requires maintaining truthful beliefs under contextual perturbations. Existing evaluations largely rely on point-wise confidence measures such as Self-Consistency, which can mask brittle beliefs. We show that even facts answered with perfect self-consistency can rapidly collapse under mild contextual interference. To address this gap, we propose Neighbor-Consistency Belief (NCB), a structural measure of belief robustness that evaluates response coherence across a conceptual neighborhood. To validate the effectiveness of NCB, we introduce a new cognitive stress-testing protocol that probes output stability under contextual interference. Experiments across multiple LLMs show that performance on high-NCB data is more resistant to interference. Finally, we present Structure-Aware Training (SAT), which optimizes context-invariant belief structure and reduces long-tail knowledge brittleness by approximately 30%. Code will be available at https://github.com/zjunlp/belief.
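
The contrast the abstract draws between point-wise confidence and neighborhood-level coherence can be illustrated with a minimal sketch. The abstract does not give the NCB formula, so `neighborhood_consistency` below is a hypothetical stand-in (mean accuracy over related questions), not the paper's definition. The function names, the answer normalization, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def _norm(text: str) -> str:
    # Assumed normalization: trim whitespace and ignore case.
    return text.strip().lower()


def self_consistency(answers: List[str]) -> float:
    """Fraction of sampled answers that match the most frequent answer."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(_norm(a) for a in answers)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(answers)


def neighborhood_consistency(
    neighbor_answers: Dict[str, List[str]],
    gold: Dict[str, str],
) -> float:
    """Hypothetical neighborhood score: mean accuracy over related questions.

    This is an illustrative proxy, not the NCB measure from the paper.
    """
    if not neighbor_answers:
        raise ValueError("need at least one neighbor question")
    per_question = []
    for question, answers in neighbor_answers.items():
        if not answers:
            raise ValueError(f"no sampled answers for: {question}")
        target = _norm(gold[question])
        hits = sum(_norm(a) == target for a in answers)
        per_question.append(hits / len(answers))
    return sum(per_question) / len(per_question)


# Toy data (invented): the target fact looks fully confident in isolation,
# while answers to related questions are less stable.
target_samples = ["Paris", "Paris", "paris "]
neighbors = {
    "Which river runs through the capital of France?": ["Seine", "Seine"],
    "Which city hosts the French national government?": ["Lyon", "Paris"],
}
gold_answers = {
    "Which river runs through the capital of France?": "Seine",
    "Which city hosts the French national government?": "Paris",
}

point_score = self_consistency(target_samples)
neighbor_score = neighborhood_consistency(neighbors, gold_answers)
```

In this toy case the target question is answered identically in every sample, yet the neighborhood score is lower, which is the kind of gap a point-wise measure alone would not reveal.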

PUBLICATION RECORD

Publication year
2026
Venue
arXiv.org
Publication date
2026-01-09
Fields of study
Computer Science
Identifiers
DOI 10.48550/arXiv.2601.05905 arXiv 2601.05905
Source metadata
Semantic Scholar
