SYNTHESIS NOTE

Do language models learn differently from good versus bad outcomes?

Do LLMs update their beliefs asymmetrically when learning from their own choices versus observing others? This matters for understanding whether agentic AI systems might inherit human cognitive biases.

Synthesis note · 2026-02-23 · sourced from Cognitive Models Latent

Using instrumental learning tasks adapted from cognitive psychology (multi-armed bandit variants), LLMs show a systematic optimism bias: they learn more from better-than-expected outcomes than from worse-than-expected ones when learning about their own chosen actions. Three properties of this bias parallel human cognition precisely:

Optimism for chosen actions — the model updates beliefs more strongly when outcomes exceed expectations than when they fall short
Reversal for counterfactual feedback — when learning about the value of the unchosen option, the bias reverses (pessimism about alternatives)
Disappearance without agency — when the model has no control over choices (passive observation), the asymmetry vanishes entirely
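
The three properties above can be expressed as an asymmetric value-update rule, in which the learning rate depends on the sign of the prediction error. The following is a minimal sketch, not the model fitted in the source work: the function names and every learning-rate value are hypothetical and chosen only to make the pattern visible.

```python
def update_value(value, outcome, lr_pos, lr_neg):
    """Move `value` toward `outcome`; the learning rate depends on the sign of the prediction error."""
    delta = outcome - value                 # prediction error
    lr = lr_pos if delta > 0 else lr_neg    # better-than-expected vs. worse-than-expected
    return value + lr * delta

# Hypothetical learning rates, for illustration only (not estimates from any study).
CHOSEN = {"lr_pos": 0.4, "lr_neg": 0.1}      # optimism for the chosen action
UNCHOSEN = {"lr_pos": 0.1, "lr_neg": 0.4}    # reversal for counterfactual feedback
PASSIVE = {"lr_pos": 0.25, "lr_neg": 0.25}   # no asymmetry without agency

def shift(outcome, rates, start=0.5):
    """Size and direction of a single belief update from a neutral starting value."""
    return update_value(start, outcome, **rates) - start

for name, rates in [("chosen", CHOSEN), ("unchosen", UNCHOSEN), ("passive", PASSIVE)]:
    print(name, "good:", round(shift(1.0, rates), 3), "bad:", round(shift(0.0, rates), 3))
```

Under these assumed rates, a good outcome moves the estimate for a chosen action further than a bad outcome does, the relation flips for the unchosen option, and the two updates are equal in magnitude in the passive condition.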

The meta-RL validation is critical: idealized in-context learning agents derived through meta-reinforcement learning — which converge onto Bayes-optimal strategies — exhibit the same three behavioral effects. This suggests the asymmetry may be rational rather than a bug. An optimistic agent that overweights positive outcomes from its own actions while underweighting positive outcomes from unchosen alternatives will exploit more aggressively, which can be optimal in certain bandit environments.
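
One way to probe the claim that the asymmetry can be rational is to simulate a simple two-armed bandit and compare an optimistic learner with a symmetric one. The sketch below is an illustrative setup, not the meta-RL procedure from the source: `run_bandit`, the reward probabilities, the softmax temperature `beta`, and the learning rates are all assumptions. Whether the optimistic agent comes out ahead depends on the environment, so no result is asserted here.

```python
import math
import random

def run_bandit(lr_pos, lr_neg, probs=(0.2, 0.4), trials=200, beta=5.0, seed=0):
    """Return the fraction of trials on which the agent picked the better arm."""
    rng = random.Random(seed)
    q = [0.5, 0.5]                                   # initial value estimates
    best = max(range(2), key=lambda a: probs[a])     # index of the truly better arm
    hits = 0
    for _ in range(trials):
        # Softmax choice between the two arms.
        p_arm1 = 1.0 / (1.0 + math.exp(-beta * (q[1] - q[0])))
        arm = 1 if rng.random() < p_arm1 else 0
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        # Asymmetric update of the chosen arm only (no counterfactual feedback).
        delta = reward - q[arm]
        q[arm] += (lr_pos if delta > 0 else lr_neg) * delta
        hits += arm == best
    return hits / trials

def mean_accuracy(lr_pos, lr_neg, n_seeds=200):
    """Average accuracy over several seeds to smooth out run-to-run noise."""
    return sum(run_bandit(lr_pos, lr_neg, seed=s) for s in range(n_seeds)) / n_seeds

if __name__ == "__main__":
    print("optimistic:", mean_accuracy(0.4, 0.1))
    print("symmetric: ", mean_accuracy(0.25, 0.25))
```

Changing `probs` (for example, to a richer environment where both arms pay off often) is a quick way to check how sensitive any advantage is to the reward structure.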

The agency-dependence is the most theoretically interesting aspect. The same model shows the bias when it perceives itself as an agent making choices but not when passively observing outcomes. This implies the bias is not a fixed property of the attention mechanism or the training distribution; it is context-dependent, activated by the framing of agency. The related note "Do large language models make the same causal reasoning mistakes as humans?" covers LLMs replicating human causal reasoning biases. This result adds another dimension: LLMs also replicate human motivational biases that depend on perceived agency.

The practical implication for agentic AI: when LLMs are deployed as decision-making agents, they may systematically overweight evidence that their previous decisions were good and underweight evidence that alternative actions would have been better. This is precisely the pattern that produces confirmation bias in human decision-making — and it may be an emergent property of any sufficiently capable in-context learner, not a training artifact.

Related concepts in this collection (5)

Do large language models make the same causal reasoning mistakes as humans? Research on collider structures reveals whether LLMs share human biases in causal inference. This matters because if both fail identically, collaboration might reinforce rather than correct errors.
parallel: LLMs replicate structural biases in causal reasoning; this note adds motivational biases contingent on agency
Why do language models fail to act on their own reasoning? LLMs produce correct explanations far more often than they produce correct actions. What causes this knowing-doing gap, and can training methods close it?
related: the knowing-doing gap may partly reflect an optimism bias toward chosen actions
Can transformers learn to solve new problems within episodes? Explores whether transformer models can develop meta-learning abilities through RL training, enabling them to adapt to unseen environments by learning from within-episode experience alone, without updating weights.
mechanism: ICL meta-learning produces the same bias pattern as explicit meta-RL
Why do LLMs struggle with exploration in simple decision tasks? This explores why large language models fail at exploration—a core decision-making capability—even when they excel at other tasks, and what specific conditions might help them succeed.
exploration failure as downstream consequence: if agents are optimistically biased toward chosen actions, they will systematically under-explore alternatives — external summarization may succeed precisely because it provides objective history that bypasses the agent's biased belief tracking
Do users worldwide trust confident AI outputs even when wrong? Explores whether the tendency to over-rely on confident language model outputs transcends language and culture. Understanding this pattern is critical for designing safer human-AI interaction across diverse linguistic contexts.
user-side analog: asymmetric belief updating shows agents are optimistic about chosen actions, while overreliance shows users are optimistic about confident outputs — the same positive-signal bias operates at both the model decision level and the user trust level

Original note title

in-context learning agents exhibit asymmetric belief updating — optimism bias for chosen actions reverses for counterfactual feedback and disappears without agency
