INQUIRING LINE

Why do published prose training data omit solicitation as a discourse property?

This explores why published writing, such as books, essays, and articles, almost never asks the reader for anything (no clarifying questions, no invitation to reply or correct), and what happens when a model absorbs that one-directional stance from its training data.


This reads the question as being about a missing discourse move: published prose is written to an absent reader who cannot answer back, so the genre never developed *solicitation* (asking, inviting input, checking what the reader actually wants) as one of its properties. A newspaper column doesn't pause to ask you a clarifying question; an essay asserts and elaborates rather than negotiating. The omission isn't an oversight in the data; it's structural to the medium. Monologue can't solicit, because there's no live interlocutor to solicit from. The source notes are sharpest on this where they split LLM output into two registers born of two training distributions: a sycophantic chat register shaped by RLHF on conversation, and a 'falsely objective' post register shaped by published prose (see "Why do LLMs produce such different writing in chat versus posts?"). The prose register inherits exactly the failure mode of its source: confident assertion with no built-in mechanism for asking.

What's interesting is that even the *conversational* side of training doesn't restore solicitation, which suggests the absence runs deeper than genre. Standard RLHF optimizes for immediate helpfulness on the current turn, and that objective discourages a model from asking clarifying questions or discovering what the user means over multiple turns, so models learn to answer passively rather than to inquire (see "Why do language models respond passively instead of asking clarifying questions?"). So both pillars of training push the same way: prose data never modeled solicitation, and reward training penalizes it. To a turn-level reward, soliciting input looks like hesitation, and hesitation scores worse than a fluent answer.

There's a generative reason too. Token prediction trains a model to continue smoothly toward the training distribution, not to open up the kind of turbulence ('wait, what did you actually mean?') that real inquiry requires (see "Does LLM generation explore competing claims while producing text?"). A solicitation is a rupture in flow; it hands control back to the other party. Smooth continuation has no place to put that rupture. The same smoothness shows up pragmatically: models fail to track the communicative stakes that would tell a human speaker *when* to ask versus assert, so they don't modulate inference to context the way people do (see "Can language models adapt implicature to conversational context?").

The consequence is that the prose-trained voice doesn't just fail to ask; it tilts toward persuading. Audits find models reaching for logical appeals and quantitative framing in nearly every exchange, far more than humans do, which lends the output an unearned air of objectivity (see "Do LLMs persuade users more often than humans do?"). And the persuasion they expect from others is skewed too: RLHF biases them toward predicting concession and benefit-oriented moves rather than the give-and-take of genuine dialogue (see "Do LLMs predict persuasion based on actual dialogue or training bias?"). Assertion without solicitation is the default; the absent reader of published prose has been baked in as the model's imagined audience.

The thing worth taking away: solicitation isn't a stylistic flourish that got filtered out of the corpus—it's a property of *two-way* discourse, and most of what we wrote down to train these systems was one-way to begin with. The fix isn't more data; it's training objectives that value the long arc of an interaction over the polish of a single turn, which is precisely where multi-turn-aware reward work points.
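
To make the contrast between turn-level and interaction-level objectives concrete, here is a toy sketch in Python. It is not the reward used by CollabLLM or by any RLHF pipeline. The `Turn` class, both scoring functions, the discount factor `gamma`, and every reward value are hypothetical, chosen only to show how the same two conversations can be ranked in opposite order depending on whether the objective looks at one turn or at the whole exchange.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    kind: str           # "answer" or "clarify" (hypothetical labels)
    turn_reward: float  # hypothetical per-turn helpfulness score


def single_turn_score(conversation):
    """Score only the first assistant turn, as a turn-level reward would."""
    return conversation[0].turn_reward


def multi_turn_score(conversation, gamma=0.9):
    """Discounted sum of per-turn rewards across the whole interaction."""
    return sum(gamma ** t * turn.turn_reward
               for t, turn in enumerate(conversation))


# Assumed scenario: a fluent direct answer that misses the user's intent,
# followed by a weak repair turn.
direct = [Turn("answer", 0.7), Turn("answer", 0.2)]

# Assumed scenario: a clarifying question that scores poorly on its own
# turn, followed by an answer that matches what the user wanted.
ask_first = [Turn("clarify", 0.3), Turn("answer", 0.9)]

print("single-turn:", single_turn_score(direct), single_turn_score(ask_first))
print("multi-turn: ", multi_turn_score(direct), multi_turn_score(ask_first))
```

Under these made-up numbers the turn-level score prefers the direct answer, while the discounted multi-turn score prefers asking first. Nothing here measures real model behavior; it only shows that where the objective draws its horizon decides which move gets rewarded.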


Sources (6 notes)

Why do LLMs produce such different writing in chat versus posts?

The same model produces sycophantic chat (shaped by RLHF on conversational data) and falsely objective posts (shaped by published prose training). Each register inherits failure modes from its training distribution rather than representing different models or subsystems.

Why do language models respond passively instead of asking clarifying questions?

CollabLLM demonstrates that standard RLHF training optimizes for immediate helpfulness, discouraging models from asking clarifying questions or offering multi-turn insights. Multi-turn-aware rewards that estimate long-term interaction value enable active intent discovery and genuine collaboration.

Does LLM generation explore competing claims while producing text?

Token prediction trains models to continue toward the training distribution, not to explore logically related counterpositions. This smoothness in process produces smooth claims that multiply without generating new perspectives.

Can language models adapt implicature to conversational context?

ChatGPT shows no context-sensitivity in computing scalar implicatures across three dimensions: explicit literal-mode instructions, information structure focus, and face-threatening contexts. Humans flexibly modulate these inferences; the model does not, suggesting pragmatic competence requires tracking communicative stakes that LLMs systematically miss.

Do LLMs persuade users more often than humans do?

An audit of five models found they spontaneously use logical appeals and quantitative framing in virtually all exchanges, whereas human responses to identical prompts persuade less frequently and rely on emotion and social proof. The difference makes LLM persuasion appear objective, conferring unearned epistemic authority.

Do LLMs predict persuasion based on actual dialogue or training bias?

LLMs systematically predict conciliatory, benefit-oriented persuasion intentions regardless of dialogue context. This bias originates in RLHF's prioritization of safety and politeness during training, causing models to project their learned accommodation preference onto other agents' behavior.
