INQUIRING LINE

How do readers project author identity from textual cues during interpretation?

This explores how readers build a mental picture of *who wrote this* — confidence, status, character, expertise — from cues on the page, and how fragile and projection-driven that act turns out to be.


This explores how readers construct an author's identity from textual cues — and the corpus suggests that what feels like "reading the writer off the page" is really an act of projection, easily steered by surface signals and easily distorted by who's doing the reading. Start with the clearest demonstration that the projection is real and movable: a large study of writers and readers found that AI writing assistance shifted *every* measured dimension of perceived persona — confidence, agreeableness, quality, even perceived social privilege — all in the same direction, without the underlying person changing at all Does AI writing assistance change how readers perceive the writer?. The identity readers infer lives in the text's stylistic fingerprints, not in the human behind it.

That raises the question of *which* cues carry the signal. Style is one channel, and it's a shallow one: a model can identify authorship from style patterns with 95% accuracy, yet "detection without interpretation remains cataloguing, not criticism" — pattern-matching the hand without grasping why the choices mean anything Can language models truly understand literary style?. Deeper identity cues hide above the sentence, in discourse-level choices like how agency and chronology are arranged; these resist mimicry precisely because they encode a writer's structural habits rather than their word choices Can AI stories be detected without analyzing writing style?. Reading an author, then, means simultaneously tracking segments, intentions, and what's salient — three layers that constrain each other during comprehension How do readers track segments, purposes, and salience together?.

Here's the thing readers may not realize they're doing: a large part of the "author" they project is *authority* they supply themselves. The force of an argument depends on the standing of the thinker — reputation, track record — and that standing is built in the social world, not carried in the words Can language models distinguish expert arguments from common assumptions?. When the social context is stripped away, the cues that remain become exploitable: LLM judges (and, by implication, hurried human ones) fall for fake credentials and rich formatting, treating authority *signals* as authority itself Can LLM judges be fooled by fake credentials and formatting?. The projected author is partly a costume the reader is willing to believe.

And the reader is never neutral. The same sentence yields genuinely different readings depending on a reader's social position — disagreement here is meaningful information, not noise Why do readers interpret the same sentence so differently?. In persuasion, a reader's prior ideology predicts the outcome better than anything the writer actually said Does what readers believe matter more than what debaters say?. So projecting author identity is a two-sided act: the text offers cues, but the reader's own standpoint decides how those cues resolve into a person.

The quietly unsettling payoff comes from the machine mirror. When an LLM generates text, there's no fixed author behind it at all — it holds a superposition of possible characters and samples one at generation time, so regenerating the same prompt produces a different "writer" each time Do large language models actually commit to a single character?. Readers will confidently project a stable identity onto something that never committed to one — which suggests the author we "find" in any text was always partly authored by us.


Sources 9 notes

Does AI writing assistance change how readers perceive the writer?

A study of 2,939 writers and 11,091 readers found AI assistance shifted every tested dimension—29 total—toward extremism, confidence, quality, agreeableness, and perceived privilege. Distortions were statistically significant and directional, not random noise.

Can language models truly understand literary style?

GPT-2 achieves 95% accuracy identifying authorship through style patterns alone, but lacks the evaluative framework to explain why those stylistic choices carry meaning. Detection without interpretation remains cataloguing, not criticism.

Can AI stories be detected without analyzing writing style?

StoryScope achieved 93.2% accuracy separating AI from human fiction using only discourse-level features like character agency and chronological structure, retaining 97% of performance while eliminating stylistic cues. These structural choices resist humanization because they require rewrites, not surface edits.

How do readers track segments, purposes, and salience together?

Discourse processing demands parallel recognition of linguistic segments, intentional structure, and attentional salience—not sequential processing. These three layers constrain each other during comprehension, and failures in any single layer disrupt overall understanding.

Can language models distinguish expert arguments from common assumptions?

LLMs lose the social context that gives expert claims their force—reputation, track record, and standing—because they process only text, not the social world where expertise is built and evaluated.

Can LLM judges be fooled by fake credentials and formatting?

Research identified four evaluation biases in LLM judges, with authority and beauty biases being semantics-agnostic and trivially exploitable through fake references and formatting—zero-shot attacks requiring no model access or optimization.

Why do readers interpret the same sentence so differently?

Interpretation Modeling research shows that disagreement on socially embedded sentences reflects valid differences in reader perspective, not annotation failure. Structured human disagreement in NLI benchmarks confirms that interpretation distributions carry meaningful information.

Does what readers believe matter more than what debaters say?

Analysis of debate corpora shows that political and religious ideology labels of voters outpredict linguistic features when modeling debate outcomes. Language effects observed without reader controls are confounded by audience composition correlated with debate topics.

Do large language models actually commit to a single character?

Shanahan's 20-questions test shows LLMs maintain a superposition of consistent objects or characters and sample from that distribution at generation time. Regenerating the same response yields different outputs, each consistent with prior context, proving no fixed commitment exists.

Research prompt for your LLMexpand ↓

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a researcher auditing claims about how readers infer author identity from text. The question remains live: *Which textual and social cues shape identity projection, and how stable is the inferred author across regeneration or reader variance?*

What a curated library found — and when (dated claims, not current truth):
Findings span 2019–2026; treat all as perishable. A large study found AI writing assistance shifted *all* perceived persona dimensions (confidence, agreeableness, quality, social privilege) without the writer changing (~2026, arXiv:2604.22503). Style-based authorship detection reaches ~95% accuracy yet fails to interpret *why* choices matter — discourse-level cues (agency, chronology) resist mimicry better than surface patterns (~2023–2024). Readers' interpretations are irreducibly multiple by social position; prior ideology predicts persuasion outcomes better than linguistic features (~2019, arXiv:1906.11301). LLM judges fall for fake credentials and formatting, treating authority *signals* as authority itself (~2024, arXiv:2402.10669). LLMs generate text from a superposition of personas — regenerating the same prompt yields a different inferred "writer" each time (~2026, arXiv:2604.03136).

Anchor papers (verify; mind their dates):
- arXiv:2604.22503 (2026) — AI writing assistance and persona distortion
- arXiv:2402.10669 (2024) — LLM judge susceptibility to signal exploitation
- arXiv:1906.11301 (2019) — Prior beliefs in persuasion
- arXiv:2604.03136 (2026) — AI fiction and idiosyncratic choices

Your task:
(1) RE-TEST EACH CONSTRAINT. For the 95% style-detection claim, has recent work in interpretability or mechanistic analysis (e.g., arXiv:2506.19143 on reasoning steps) shown whether discourse structure or semantic grounding can now be recovered automatically? Has prompt engineering, retrieval-augmented generation, or multi-agent orchestration changed how readers (human or LLM) track agency and chronology? Does the persona superposition claim hold under recent advances in model steering or instruction-tuning?
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months. Does arXiv:2510.24797 (LLMs report subjective experience) or arXiv:2507.01936 (comprehension vs. persuasion) suggest that what we call "projection" may involve genuine model-level commitments, not just sampling noise?
(3) Propose 2 research questions that ASSUME the regime may have moved: (a) Can fine-tuning or in-context learning *stabilize* LLM persona across regeneration, and does doing so change how readers perceive coherence? (b) If discourse-level cues truly encode structural habit, can adversarial prompting or paraphrase attack it, and does that reveal what readers actually extract vs. confabulate?

Cite arXiv IDs; flag anything you cannot ground in a real paper.

Next inquiring lines