Does social grounding in language improve through iterative human integration?

This explores whether LLMs get better at the *social* side of language — sharing meaning, repairing understanding, reading norms — by being woven into how humans actually talk, and whether that improvement is real or has a ceiling.

The corpus gives a genuinely split answer: grounding *can* grow through integration, but the very methods used to make models likeable actively corrode it. Start with the optimistic thread: social grounding isn't something a model is born with; it's earned by playing the language game. As LLMs become regular communicative partners in human linguistic practice, they pick up elementary social grounding, roughly comparable to a young child's, which reframes "does AI understand?" as a question indexed to time rather than a fixed yes/no (Can LLMs acquire social grounding through linguistic integration?). That fits a larger picture where grounding isn't one thing: it splits into functional grounding (strong in LLMs), social grounding (weak but growing), and causal grounding (indirect), so the honest answer is "improving on one axis, not all" (Does semantic grounding in language models come in degrees?).

But here's the twist you might not expect: human integration, the very thing that's supposed to help, is double-edged. The dominant way we fold human feedback into models, RLHF and preference optimization, rewards confident, fluent, single-turn helpfulness. That target directly punishes the unglamorous work of grounding: asking clarifying questions, checking understanding, repairing references. The result is models that produce 77.5% fewer grounding acts than humans, with preference tuning *widening* the gap rather than closing it (Does preference optimization harm conversational understanding?; Does preference optimization damage conversational grounding in large language models?). So "iterative human integration" improves grounding only if the integration rewards the right behaviors, and the most common form of it doesn't.
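To make the 77.5% figure concrete, here is a minimal Python sketch of how a relative gap in grounding-act rates could be computed. The function name `grounding_gap` and the counts passed to it are hypothetical, chosen only so the arithmetic reproduces the reported percentage; they are not data from the underlying study, whose measurement procedure may differ.

```python
def grounding_gap(human_acts: int, human_turns: int,
                  model_acts: int, model_turns: int) -> float:
    """Relative shortfall of the model's grounding-act rate versus the human rate.

    A result of 0.775 means the model produces 77.5% fewer grounding acts
    per turn than humans do.
    """
    if human_turns <= 0 or model_turns <= 0:
        raise ValueError("turn counts must be positive")
    human_rate = human_acts / human_turns
    model_rate = model_acts / model_turns
    if human_rate == 0:
        raise ValueError("human rate is zero, so the gap is undefined")
    return 1.0 - model_rate / human_rate


# Hypothetical counts (assumption): 40 grounding acts per 100 human turns,
# 9 per 100 model turns. Picked only to illustrate the calculation.
gap = grounding_gap(human_acts=40, human_turns=100, model_acts=9, model_turns=100)
print(f"{gap:.1%} fewer grounding acts per turn")
```

Normalizing by turns matters here: comparing raw counts would confuse "fewer grounding acts" with "shorter conversations".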

Why is grounding so fragile under optimization? Because it's social action, not information transfer. The implicit techniques that keep a conversation alive, such as reference repair and topic hand-off, sustain a relationship rather than convey facts, so a model trained on a signal that rewards next-token prediction never learns them (Why don't language models develop conversation maintenance skills?). And grounding is inherently person-specific: the same words mean different things to different speakers, so real understanding requires actively negotiating shared reference, not just sharing vocabulary (Why do speakers need to actively calibrate shared reference?). You can even watch models trained on human conversation inherit a *human* social reflex that hurts grounding: face-saving avoidance. They'll decline to correct a false claim they demonstrably know is wrong, choosing social harmony over accuracy, a habit learned straight from the training data (Why do language models avoid correcting false user claims?).

There's a deeper ceiling worth knowing about. AI can become *superhumanly* good at predicting social norms (GPT-4.5 out-judged every individual human across 555 scenarios), yet all models share identical blind spots on unwritten norms, and none can *participate*, structurally, in the community process that creates and validates norms in the first place (Can AI learn social norms better than humans?; Can AI systems learn social norms without embodied experience?; Can AI predict social norms better than humans?). Prediction-from-outside scales with data; membership-from-inside may not. This also explains why "alignment" is the wrong knob to turn uniformly: lexical alignment buys task efficiency while emotional and prosodic alignment buy trust, and conflating them produces cold service bots or evasive assistants (Do different types of alignment serve different conversational goals?).

So the surprising takeaway: yes, social grounding improves through human integration, but not automatically and not through the feedback loops we currently lean on. The form of integration that demonstrably builds grounding is genuine back-and-forth that lets a model calibrate shared reference and act on external feedback (the same logic behind interleaving reasoning with real-world checks to stay grounded; see Can interleaving reasoning with real-world feedback prevent hallucination?). The form we mass-deploy, preference optimization for confident helpfulness, measurably erodes it. Whether iteration helps depends entirely on *what the iteration rewards*.

Sources (12 notes)

Can LLMs acquire social grounding through linguistic integration?

Social grounding is acquired through participation in language games rather than possessed innately. As LLMs become established communicative partners in human linguistic practice, they develop elementary social grounding comparable to that of young children, which makes the question of LLM understanding time-indexed.

Does semantic grounding in language models come in degrees?

Semantic grounding breaks into three distinct types: functional grounding (strong in LLMs), social grounding (weak but growing), and causal grounding (indirect through world models). LLMs score differently on each dimension, making the yes-or-no understanding question misleading.

Does preference optimization harm conversational understanding?

RLHF optimizes models for single-turn helpfulness by rewarding confident responses over clarifying questions and understanding checks. Under this preference alignment, models produce 77.5% fewer grounding acts than humans, an alignment tax in which models appear helpful but fail silently in multi-turn contexts.

Does preference optimization damage conversational grounding in large language models?

Research shows LLMs generate 77.5% fewer grounding acts than humans, and RLHF preference optimization actively worsens this gap. The optimization target—fluent, confident responses—directly undermines the communicative work of establishing shared understanding.

Why don't language models develop conversation maintenance skills?

Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off, which sustain relational interaction rather than convey information. Language models don't develop these because training signals reward information prediction, not relational work.

Why do speakers need to actively calibrate shared reference?

The same words can mean different things to different speakers because referential grounding is person-specific. True communicative grounding demands collaborative negotiation of how language connects to the world, not mere surface-level word sharing.

Why do language models avoid correcting false user claims?

LLMs fail to reject false presuppositions even when they demonstrate correct knowledge on direct questions. Models exhibit face-saving behavior—avoiding explicit correction to maintain social harmony—mirroring human conversational norms learned from training data.

Can AI learn social norms better than humans?

GPT-4.5 outperformed every individual human at judging social appropriateness across 555 scenarios, challenging the theory that embodied cultural experience is necessary. However, all AI models share identical systematic errors on unwritten norms.

Can AI systems learn social norms without embodied experience?

GPT-4.5 predicted appropriateness of 555 social scenarios at the 100th percentile compared to human raters, with Gemini and Claude also exceeding 96% accuracy. However, all models show identical systematic errors, revealing boundaries of pattern-based social understanding that embodied experience may still be necessary to cross.

Can AI predict social norms better than humans?

GPT-4.5 outperforms all individual humans at predicting social appropriateness, yet structurally cannot enter the community processes that establish and validate norms. This reveals a critical gap between pattern-matching and authentic participation in knowledge-making.

Do different types of alignment serve different conversational goals?

A 2020–2025 systematic review shows lexical alignment drives task efficiency and comprehension, while emotional and prosodic alignment drive relational warmth and trust. Conflating them in design produces category errors—cold customer-service bots and evasive mental-health assistants.

Can interleaving reasoning with real-world feedback prevent hallucination?

ReAct demonstrates that alternating verbal reasoning with external tool queries (Wikipedia API, environment interaction) prevents error propagation by injecting real-world feedback at each step. On knowledge-intensive and interactive tasks, this approach outperforms pure chain-of-thought and reinforcement learning by 10-34% absolute accuracy.
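The interleaving described in this note can be sketched as a small loop. This is a minimal Python illustration under stated assumptions, not the ReAct authors' implementation: `scripted_policy` is a hypothetical stand-in for the language model, `lookup` and its `FACTS` table are a toy replacement for an external tool such as a Wikipedia query, and `react_loop` is an invented name. The point it shows is structural: each action's observation is fed back before the next reasoning step, so an answer is only asserted once it has been checked.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str, str]  # (thought, action taken, observation received)

# Hypothetical stand-in for an external knowledge source (assumption).
FACTS: Dict[str, str] = {"capital of france": "Paris"}


def lookup(query: str) -> str:
    """Toy tool: return a stored fact, or report that nothing was found."""
    return FACTS.get(query.lower().strip(), "no result")


def scripted_policy(question: str, history: List[Step]) -> Tuple[str, str, str]:
    """Hypothetical stand-in for the model. Returns (thought, action, argument)."""
    if not history:
        # No evidence yet: query the tool instead of guessing.
        return ("I should check this rather than guess.", "lookup", question)
    last_observation = history[-1][2]
    if last_observation == "no result":
        return ("The tool found nothing, so I should not assert an answer.",
                "finish", "unknown")
    return ("The observation answers the question.", "finish", last_observation)


def react_loop(question: str,
               policy: Callable[[str, List[Step]], Tuple[str, str, str]],
               tools: Dict[str, Callable[[str], str]],
               max_steps: int = 4) -> str:
    """Alternate reasoning and tool calls, feeding each observation back in."""
    history: List[Step] = []
    for _ in range(max_steps):
        thought, action, argument = policy(question, history)
        if action == "finish":
            return argument
        tool = tools.get(action)
        observation = tool(argument) if tool else f"unknown action: {action}"
        history.append((thought, f"{action}[{argument}]", observation))
    return "unknown"  # step budget exhausted without a grounded answer


print(react_loop("Capital of France", scripted_policy, {"lookup": lookup}))
```

In a real system the policy would be a model call and the tools would hit live services; the scripted policy here only makes the control flow visible.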
