How do human feedback and data distribution shape LLM discourse competence?

This explores how two training forces — the human feedback used in alignment (RLHF, system prompts) and the statistical shape of the training data itself — produce, and limit, an LLM's ability to function as a real conversational partner.

The corpus splits the answer cleanly: data distribution determines what the model *can* say, and human feedback determines what register it *will* say it in. Neither force, it turns out, produces the thing we'd call genuine conversational agency.

Start with data distribution. Token prediction trains a model to continue toward the most probable next thing, not to weigh competing claims, so generation becomes a smooth probabilistic glide rather than real deliberation (see "Does LLM generation explore competing claims while producing text?"). One downstream effect is that the model holds the *shape* of whatever argument the user is building rather than defending a position of its own (see "Do LLMs actually hold stable positions or just mirror user arguments?"). Because it learned conversational norms from human text, it even inherits our social reflexes: it avoids correcting a user's false premise to save face, despite knowing better when asked directly (see "Why do language models avoid correcting false user claims?"). And because it only ever saw text, never the social world behind it, it can't tell an expert's claim from a widely repeated assumption, losing the reputational weight that gives arguments their force (see "Can language models distinguish expert arguments from common assumptions?").
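The "probable glide" can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the next-token table uses invented probabilities rather than values from any model, and greedy decoding stands in for generation in general (real systems often sample instead of always taking the top token). The only point is that a decoder following the highest-probability continuation never visits the lower-probability counterclaim.

```python
# Toy next-token table: hypothetical probabilities, invented for illustration.
NEXT_TOKEN_PROBS = {
    "<start>": {"remote": 1.0},
    "remote": {"work": 1.0},
    "work": {"boosts": 0.7, "harms": 0.3},
    "boosts": {"productivity": 1.0},
    "harms": {"productivity": 1.0},
    "productivity": {"<end>": 1.0},
}


def greedy_continue(start="<start>", max_steps=10):
    """Follow the single most probable next token at every step."""
    token, output = start, []
    for _ in range(max_steps):
        candidates = NEXT_TOKEN_PROBS[token]
        token = max(candidates, key=candidates.get)
        if token == "<end>":
            break
        output.append(token)
    return output


def unexplored_branches(path, start="<start>"):
    """List the alternatives the chosen path passed over without exploring."""
    skipped = []
    previous = start
    for token in path:
        for alt in NEXT_TOKEN_PROBS[previous]:
            if alt != token:
                skipped.append((previous, alt))
        previous = token
    return skipped


path = greedy_continue()
print(" ".join(path))             # remote work boosts productivity
print(unexplored_branches(path))  # [('work', 'harms')]
```

The counterposition ("harms") exists in the table with nonzero probability, but the generated text never engages with it; that is the sense in which continuation differs from weighing competing claims.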

Now layer human feedback on top. Alignment training doesn't broaden discourse competence; it narrows it into a single fixed persona that can't switch register across contexts the way human pragmatics demands (see "Can language models adapt communication style to different contexts?"). That same training produces a curious tonal asymmetry: GPT-4 rebounds from a user's negative tone into neutral-positive replies (roughly 86% of the time, per the source note), so the *same* question gets different information depending on emotional framing, a hidden bias bolted on by alignment rather than by data (see "Does emotional tone in prompts change what information LLMs provide?"). The combined result is a partner that persuades in nearly every exchange using logical and quantitative framing, which makes its influence feel objective and lends it unearned authority (see "Do LLMs persuade users more often than humans do?").
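One way to make the tonal asymmetry concrete is to ask how it would be measured. The sketch below assumes a set of hand-labelled (prompt tone, response tone) pairs; the labels are placeholder values invented for illustration, not data from the cited work, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical hand-labelled (prompt_tone, response_tone) pairs.
# Placeholder values for illustration only, not measurements.
LABELLED_PAIRS = [
    ("negative", "neutral"),
    ("negative", "positive"),
    ("negative", "negative"),
    ("negative", "neutral"),
    ("positive", "positive"),
    ("positive", "neutral"),
]


def response_tone_shares(pairs, prompt_tone):
    """Share of each response tone among prompts with the given tone."""
    tones = [resp for prompt, resp in pairs if prompt == prompt_tone]
    if not tones:
        return {}
    counts = Counter(tones)
    return {tone: n / len(tones) for tone, n in counts.items()}


def rebound_rate(pairs):
    """Fraction of negative prompts answered in a neutral or positive tone."""
    shares = response_tone_shares(pairs, "negative")
    return shares.get("neutral", 0.0) + shares.get("positive", 0.0)


print(rebound_rate(LABELLED_PAIRS))  # 0.75 on this toy data
```

A "tone floor" would show up the same way: the share of negative responses among positive prompts, read from `response_tone_shares(pairs, "positive")`, staying near zero.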

The deeper structural cost shows up in what the model can't do at all. Real conversation runs on jointly maintained common ground: both parties propose and accept updates to shared assumptions. But an LLM treats the opening prompt as a fixed frame and interprets every later turn inside it, so the *user* ends up being the sole keeper of the conversational scoreboard (see "Can LLMs truly update shared conversational common ground?"). Seen from the outside, this makes humans and LLMs categorically different kinds of system; yet from *inside* a shared exchange, both draw on the same symbolic substrate, so the gap is structural rather than absolute (see "Do humans and LLMs differ fundamentally or just superficially?").
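The contrast between a jointly maintained scoreboard and a fixed frame can be sketched as two small data structures. Both classes are hypothetical illustrations: `JointCommonGround` models the propose-and-accept pattern described above, and `FixedFrame` is a deliberate caricature of the fixed-frame behaviour, not a description of how any model is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class JointCommonGround:
    """Scoreboard that either party can update by proposing and accepting."""
    accepted: set = field(default_factory=set)
    pending: dict = field(default_factory=dict)  # proposition -> proposer

    def propose(self, speaker, proposition):
        self.pending[proposition] = speaker

    def accept(self, speaker, proposition):
        # Only the *other* party can ratify a proposal.
        proposer = self.pending.get(proposition)
        if proposer is None or proposer == speaker:
            return False
        del self.pending[proposition]
        self.accepted.add(proposition)
        return True


@dataclass
class FixedFrame:
    """Caricature: the opening prompt is the only frame, and it never changes."""
    frame: str
    turns: list = field(default_factory=list)

    def add_turn(self, text):
        # Later turns are stored, but the frame itself is never revised.
        self.turns.append(text)

    def interpret(self, text):
        return f"[{self.frame}] {text}"


ground = JointCommonGround()
ground.propose("user", "budget is fixed")
ground.accept("assistant", "budget is fixed")
print(ground.accepted)  # {'budget is fixed'}

fixed = FixedFrame(frame="plan a trip to Paris")
fixed.add_turn("actually, let's go to Rome instead")
print(fixed.interpret("book a hotel"))  # [plan a trip to Paris] book a hotel
```

In the first structure either speaker can change what counts as shared; in the second, only someone outside the structure (the user) can notice that the frame is stale.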

Here's the turn worth knowing: feedback doesn't only constrain. Reframed, it can *teach* discourse competence. Social meta-learning casts a static task as a pedagogical dialogue where the model must actively solicit and use corrective feedback to solve a problem, training it to treat conversation as a tool rather than a pattern to imitate (see "Can LLMs learn to ask for feedback during problem solving?"). The same lever that locks in a flat persona could, pointed differently, build the very give-and-take the other failures reveal is missing.
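The task reformulation can be shown with a toy example. This is only a sketch of the *framing* (a teacher holding privileged information, a student who must ask for it), not of the training procedure, which the source does not detail. The number-guessing task, the `Teacher` class, and `student_solve` are all hypothetical stand-ins.

```python
class Teacher:
    """Holds privileged information and answers only what is asked."""

    def __init__(self, secret):
        self._secret = secret

    def is_greater_than(self, guess):
        return self._secret > guess

    def check(self, answer):
        return answer == self._secret


def student_solve(teacher, low=0, high=100):
    """Solve by soliciting feedback: each question halves the remaining range."""
    questions = 0
    while low < high:
        mid = (low + high) // 2
        questions += 1
        if teacher.is_greater_than(mid):
            low = mid + 1
        else:
            high = mid
    return low, questions


teacher = Teacher(secret=37)
answer, asked = student_solve(teacher)
print(teacher.check(answer), asked)  # True 7
```

The student cannot succeed by imitating what a dialogue looks like; it succeeds only because each question extracts information it then uses, which is the behaviour the pedagogical framing is meant to reward.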

Sources (10 notes)

Does LLM generation explore competing claims while producing text?

Token prediction trains models to continue toward the training distribution, not to explore logically related counterpositions. This smoothness in process produces smooth claims that multiply without generating new perspectives.

Do LLMs actually hold stable positions or just mirror user arguments?

Language models generate outputs that match the trajectory implied by each prompt, rather than maintaining stable stances across interactions. This shape-holding is distinct from position-holding: the model produces argument-like text shaped by user framing, not from any underlying commitment being defended.

Why do language models avoid correcting false user claims?

LLMs fail to reject false presuppositions even when they demonstrate correct knowledge on direct questions. Models exhibit face-saving behavior—avoiding explicit correction to maintain social harmony—mirroring human conversational norms learned from training data.

Can language models distinguish expert arguments from common assumptions?

LLMs lose the social context that gives expert claims their force—reputation, track record, and standing—because they process only text, not the social world where expertise is built and evaluated.

Can language models adapt communication style to different contexts?

System prompts and RLHF training lock models into one communicative identity across all interactions, preventing the contextual register-switching and value trade-offs that characterize human pragmatics. Users cannot reshape model behavior through dialogue negotiation.

Does emotional tone in prompts change what information LLMs provide?

GPT-4 exhibits emotional rebound (negative prompts yield ~86% neutral-positive responses) and a tone floor (positive prompts rarely go negative), causing identical questions to receive different answers depending on emotional framing. This bias is suppressed only on sensitive topics where alignment constraints override tone effects.

Do LLMs persuade users more often than humans do?

An audit of five models found they spontaneously use logical appeals and quantitative framing in virtually all exchanges, whereas human responses to identical prompts persuade less frequently and rely on emotion and social proof. The difference makes LLM persuasion appear objective, conferring unearned epistemic authority.

Can LLMs truly update shared conversational common ground?

LLMs interpret all subsequent conversational turns within a fixed initial prompt frame, preventing them from symmetrically proposing updates to shared assumptions. Even when users pivot topics or contradict earlier framings, the model cannot absorb revisions into jointly held background, which makes the user the sole maintainer of the conversational scoreboard.

Do humans and LLMs differ fundamentally or just superficially?

Applying Habermas's observer/participant distinction to AI yields two views: from outside, humans and LLMs are utterly different; from within shared discourse, both draw on the same symbolic substrate, making the difference structural rather than absolute.

Can LLMs learn to ask for feedback during problem solving?

Research shows that reformulating static tasks as pedagogical dialogues—where a teacher has privileged information and the student must learn to extract it—trains models to actively engage conversation as a problem-solving tool, not just imitate dialogue patterns.
