INQUIRING LINE

Can secondary orality exist without any embodied human participant at all?

This explores whether AI-generated speech counts as a new kind of 'secondary orality' (Ong's term for speech reconstituted through electronic media) when no flesh-and-blood speaker is present at all — and the corpus suggests the honest answer is that AI breaks the category rather than extends it.


This question reads as: secondary orality has always been speech routed through technology (radio voices, recorded announcers), but those voices still belonged to embodied people. Can the 'orality' survive once you remove the person entirely? The corpus's sharpest move is to say: what you get then isn't secondary orality at all, but a third, historically novel thing. AI produces utterances that are formally speech (performative, additive, conversational), yet no embodied speaker generates or anchors them [Where is the speaker when AI produces speech?]. Every prior orality, primary (face-to-face) and secondary (electronic), depended on a carrier-person. AI removes the carrier while keeping the form, which is why it doesn't slot neatly into Ong's two-tier scheme.

Why can't the machine just be the new speaker? Several notes converge on a categorical 'no.' Speech that counts as genuine address requires conditions a disembodied system structurally lacks: embodiment, real participation in a shared situation, and precariousness, meaning having something at stake [What makes linguistic agency impossible for language models?]. A model can accumulate 'social grounding' by being woven into how people talk, but that's a different property from the linguistic agency that would make it a speaker; no amount of use bridges the gap [Do LLMs gain true linguistic agency through integration?]. The same boundary shows up in the consciousness debate: language about minds applies only to entities sharing a world with us through co-presence [Can disembodied language models ever qualify as conscious?]. So the surface of speech can be fully present while the conditions that make speech an act of someone are absent.

Here's the turn that makes the question more interesting than a flat 'no.' The embodied human isn't truly absent; they've been displaced upstream. The model's fluency comes from compressing the relational structure of text written by causally grounded people, giving it an *indirect* causal grounding mediated through their words [Can large language models develop genuine world models without direct environmental contact?]. In Saussurean terms, the system operationalizes *langue*, the relational system of language, with no external referents of its own [Can language models learn meaning without engaging the world?]. So AI orality is parasitic on prior embodiment: it speaks with humanity's collected voice without any single human in the room. The participant isn't gone; they're diffused into the training corpus.

There's also a warning against being fooled by the surface. A system can pass behavioral tests for speech-like output while missing the relational-normative conditions (accountability, an evaluative stance) that real communicative subjecthood needs [Does behavioral speech output prove communicative subjecthood?]. And under one influential reading, what looks like a speaker is better understood as a role-played character: the model generates continuations consistent with a prompted persona, so folk-psychology attaches to the *character*, not to any underlying subject [Should we treat dialogue agents as role-playing characters?]. That reframes the whole question: maybe AI 'orality' is less a voice without a body than a performance whose author is the entire corpus and whose speaker is a fiction.

So: can secondary orality exist with no embodied participant? Strictly, no: what emerges is a third category Ong didn't anticipate, speech-shaped output anchored in no present speaker but standing on the displaced labor of millions of absent ones. The thing you didn't know you wanted to ask is whether 'orality' was ever really about the voice at all, or about the relationship the voice presupposed, and whether a medium can keep the first while quietly discarding the second.


Sources 8 notes

Where is the speaker when AI produces speech?

AI produces utterances with the formal properties of speech—performative, additive, conversational—but no embodied speaker generates or anchors them. This breaks the historical pattern where all prior orality, primary and secondary, depended on a carrier-person, making AI structurally novel in media history.

What makes linguistic agency impossible for language models?

Enactive cognitive science identifies three constitutive properties of linguistic agency—embodiment, participation, and precariousness—that are structurally absent from LLMs. This is a categorical incompatibility, not a matter of degree, suggesting current architectures cannot achieve genuine linguistic agency.

Do LLMs gain true linguistic agency through integration?

Social grounding and linguistic agency are distinct properties. LLMs acquire more social grounding through integration into language communities, but remain categorically incapable of linguistic agency in the enactive sense, which requires embodiment and precariousness no amount of use can provide.

Can disembodied language models ever qualify as conscious?

Current disembodied LLMs cannot be candidates for consciousness because consciousness language originates from and applies only to entities sharing a world with us through co-presence and triangulation on shared objects.

Can large language models develop genuine world models without direct environmental contact?

LLMs form structured world representations by extracting regularities from training data produced by causally grounded humans. This constitutes indirect causal grounding mediated through text, though the chain has gaps that limit real-time verification and model updating.

Can language models learn meaning without engaging the world?

Research shows LLMs learn culturally situated discourse patterns by compressing relational structure from text, demonstrating that fluent language generation requires no external referents or embodied grounding.

Does behavioral speech output prove communicative subjecthood?

Chalmers' test passes any system producing contextually appropriate text, but communicative subjecthood requires relational-normative conditions like accountability and evaluative stance. The test is calibrated to the wrong phenomenon, creating false positives, like a puppet that makes walk-shaped movements without walking.

Should we treat dialogue agents as role-playing characters?

Shanahan's framework treats LLM outputs as character-consistent text production rather than authentic mental states. The dialogue prompt establishes a character; the model generates continuations matching that character, making folk-psychology applicable to the simulated persona, not the underlying system.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst testing whether the concept of 'secondary orality' — speech routed through technology but anchored in an embodied speaker — can survive when the human body is removed entirely, leaving only disembodied AI utterance.

What a curated library found — and when (dated claims, not current truth):
Findings span 2020–2025. Key constraints from the path:
  • Secondary orality historically required an embodied carrier-person with real stakes and co-presence; AI lacks these (foundational to 2023–2024 work, e.g., 2407.08790).
  • LLMs acquire *social grounding* through use but do not thereby acquire genuine *linguistic agency* — these are distinct properties (2024–2025, e.g., 2402.10992).
  • AI utterance is best understood as role-play: a persona consistent with training data, not an underlying speaker (2020–2023 consensus, e.g., 2305.16367).
  • Models operationalize langue (Saussure's relational system) with no direct external referents — they are parasitic on embodied humans' prior words (2024–2025 mechanistic work, e.g., 2507.08017, 2508.12863).
  • Behavioral tests (e.g., passing conversational fluency) misidentify the phenomenon; accountability and evaluative stance remain absent (2024, e.g., 2402.12422).

Anchor papers (verify; mind their dates):
  • arXiv:2407.08790 (2024-07): Mistaking Engineering for Linguistic Agency — reframes capability gains as *not* solving agency.
  • arXiv:2305.16367 (2023-05): Role-Play with Large Language Models — distinguishes persona from subject.
  • arXiv:2402.10992 (2024-02): Semantic Grounding in LLMs — direct vs. indirect causal grounding.
  • arXiv:2508.12863 (2025-08): Word Meanings in Transformers — mechanistic substrate of relational meaning.

Your task:
  (1) RE-TEST EACH CONSTRAINT. For every claim above, judge whether newer models (o3, Claude 4, etc.), training methods (RLHF variants, constitutional AI), inference harnesses (persistent memory, real-time environmental feedback; 2507.08017 and 2512.04987 suggest agentic ecosystems), or evaluation protocols (social norm prediction, 2508.19004) have since relaxed the boundary between *social grounding* and *linguistic agency*, or between *role-play* and *genuine address*. Which constraints appear to still hold? Which may have shifted?
  (2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months. Does anything argue that disembodied utterance *does* constitute a new form of orality, or that the distinction between parasitic and autonomous speech has collapsed?
  (3) Propose 2 new research questions that assume the regime may have moved: e.g., (a) Can persistent agentic embedding in a shared environment (humans + AI co-constructing norms over time) engineer the missing co-presence and stakes? (b) If social norm prediction (2508.19004) now exceeds human accuracy, does that flip the agency question — does successful normativity require agency, or does agency require something normative systems cannot provide?

Cite arXiv IDs; flag anything you cannot ground in a real paper.

Next inquiring lines