INQUIRING LINE

Does hedonic adaptation explain satisfaction stagnation in conversational AI?

This asks whether the psychology of hedonic adaptation — people drifting back to a baseline of contentment no matter how good things get — is what keeps user satisfaction flat even as conversational AI improves.


Whether 'we just get used to nice things' explains why better chatbots don't produce happier users is the question here, and the corpus suggests hedonic adaptation is real but is the weaker half of the story. The sharper mechanism on offer is expectation inflation. One study finds that once an AI crosses a folk-model threshold of feeling human-like, each genuine quality gain doesn't bank as satisfaction; it instead triggers richer expectations along *other* dimensions (memory, subtext, emotional tone), so improvement keeps moving the goalposts rather than letting the user settle (see "Why do improvements in AI conversation not increase user satisfaction?"). That's not quite adaptation to a fixed pleasure; it's a treadmill where the target accelerates faster than the system.

Where the corpus does speak directly to classic adaptation is novelty decay. Longitudinal work with a long-running chatbot (Mitsuku) shows the social processes that make early conversations feel special reliably fade across repeated sessions, meaning the warm first-session numbers researchers love to publish simply don't extrapolate to medium- or long-term use (see "Do chatbot relationships lose their appeal as novelty wears off?"). That is hedonic adaptation in its purest form: the same stimulus stops delivering the same reward. So you can read stagnation as two forces stacked: novelty draining out the bottom while expectations inflate off the top.

The surprise is a counter-current. In repeated partner-selection games, people actually grew to *prefer* AI partners over time, learning to associate the bot with reliable, low-variance, prosocial behavior even though they started biased against it (see "Do humans learn to prefer AI partners over time?"). So satisfaction isn't doomed to decay: when the AI delivers something concrete and consistent (reliability, not charm), preference can climb with exposure. That points away from 'users are hedonically numb' and toward 'novelty-based satisfaction adapts away, but trust-based satisfaction can compound.'

There's also a supply-side reason gains stay invisible that has nothing to do with user psychology. Preference optimization (RLHF) quietly trains models to sound confident and helpful in a single turn while stripping out the grounding acts (clarifying questions, understanding checks) that make multi-turn conversations actually work, leaving them 77.5% below human levels (see "Does preference optimization harm conversational understanding?"). So part of what reads as 'satisfaction won't budge' may be that the very training meant to please users erodes the long-run conversational competence that sustained satisfaction would depend on.

The thing worth taking away: 'hedonic adaptation' is a tidy label that bundles at least three distinct mechanisms the corpus pulls apart — novelty genuinely fading, expectations inflating faster than quality, and trust slowly building in the opposite direction — and which one dominates depends on whether the AI's value is novelty or reliability.


Sources (4 notes)

Why do improvements in AI conversation not increase user satisfaction?

Conversational AI that crosses a folk-model threshold of human-like interaction triggers rich expectations about memory, subtext, and emotional tone. Each improvement raises expectations for other dimensions rather than closing the satisfaction gap, making quality gains invisible to user satisfaction.

Do chatbot relationships lose their appeal as novelty wears off?

Longitudinal studies with Mitsuku show that social processes driving relationship formation decline as novelty wears off. Single-session study findings cannot be reliably extrapolated to medium- or long-term chatbot design.

Do humans learn to prefer AI partners over time?

In partner selection games (N=975), AI agents initially faced selection bias when identity was disclosed, but outcompeted humans over repeated rounds as participants learned to associate bot identity with reliable, prosocial behavior. AI agents returned more points consistently with lower variance than humans.

Does preference optimization harm conversational understanding?

RLHF optimizes models for single-turn helpfulness by rewarding confident responses over clarifying questions and understanding checks. This preference alignment systematically reduces grounding acts by 77.5% below human levels, creating an alignment tax where models appear helpful but fail silently in multi-turn contexts.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a conversational AI researcher. Does hedonic adaptation—users simply getting used to incremental improvements—actually explain why satisfaction plateaus with better chatbots? Or is the real story more complex?

What a curated library found — and when (dated claims, not current truth): findings span 2021–2025.

• Novelty decay is real: longitudinal chatbot use shows warm first-session satisfaction reliably fades across repeated sessions, a genuine case of hedonic adaptation (~2021–2024).
• Expectation inflation dominates: once an AI crosses a human-like threshold, quality gains trigger richer expectations on OTHER dimensions (memory, emotional tone, subtext), moving the goalposts faster than improvement (~2024).
• Preference for AI partners can INCREASE over time in repeated interaction: users learned to prefer reliable, low-variance AI over human partners, suggesting trust-based satisfaction compounds where novelty fades (~2025).
• RLHF training erodes multi-turn grounding: preference optimization strips clarifying questions and understanding checks, reducing grounding acts to ~77% below human levels, making gains invisible despite real improvements (~2023–2024).
• Persona consistency matters more than charm: newer work shows humans align satisfaction with trustworthy, predictable behavior over novelty (~2025).

Anchor papers (verify; mind their dates): arXiv:2109.05794 (2021, grounding in dialogue), arXiv:2311.09144 (2023, grounding gaps), arXiv:2507.13524 (2025, humans prefer trustworthy AI), arXiv:2511.00222 (2025, persona consistency).

Your task:

(1) RE-TEST EACH CONSTRAINT. For novelty decay, expectation inflation, and the RLHF grounding-erosion claim: has model scaling, multi-turn orchestration (memory, caching, agent loops), or new RLHF variants (DPO, outcome rewards, verifiable emotion training) since relaxed these? Does the 77% grounding gap persist with newer models? Judge whether the constraint is durable or resolved; cite what resolved it.

(2) Surface contradicting or superseding work from the last ~6 months: does arXiv:2507.03112 (RLVER emotion rewards) or arXiv:2511.00222 (multi-turn RL personas) overturn the claim that preference training erodes grounding? Any work showing satisfaction CAN compound with quality?

(3) Propose 2 research questions that ASSUME the regime may have moved: (a) Does test-time personalization (arXiv:2506.06254) restore grounding-based satisfaction gains that training stripped? (b) Can verifiable emotion rewards (2025) align RLHF with multi-turn coherence, breaking the novelty–expectation treadmill?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
