Can imaginary listeners reduce dialogue agent contradictions?
Does simulating how an imaginary listener would interpret an utterance help dialogue agents maintain persona consistency without extra training? This note explores whether pragmatic self-monitoring at generation time can replace costly supervised approaches.
Persona-based dialogue agents routinely contradict their own stated attributes. Previous solutions either require Natural Language Inference (NLI) labels for training or attach extra trained modules. The "Will I Sound Like Me?" approach (2020) takes a different path: it endows existing agents with public self-consciousness at inference time through an imaginary listener, inspired by social cognition and pragmatics.
The mechanism uses the Rational Speech Acts (RSA) framework. Before committing to an utterance, the agent simulates how a listener would interpret it: specifically, whether the listener could distinguish this speaker's persona from a distractor persona based on the utterance alone. Utterances that would not help identify the speaker, because they are generic or contradict the persona, are down-weighted. In effect, the agent asks at generation time: "Would I sound like me if I said this?"
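The listener simulation described above can be sketched as a toy reranking step. This is a minimal illustration under stated assumptions, not the paper's implementation: it scores whole candidate utterances instead of working inside a real decoder, the probabilities are invented, and the names `S0`, `PRIOR`, `literal_listener`, `pragmatic_speaker`, `alpha`, and `beta` are hypothetical. It follows one common RSA formulation in which the pragmatic speaker multiplies the base model's probability by the listener's belief that the utterance came from the intended persona.

```python
from typing import Dict

# Hypothetical base-speaker probabilities S0(utterance | persona).
# All numbers are invented for illustration only.
S0: Dict[str, Dict[str, float]] = {
    "self": {  # persona: "I have two dogs."
        "I love walking my dogs.": 0.30,
        "That sounds nice.": 0.45,
        "I have never owned a pet.": 0.25,
    },
    "distractor": {  # persona: "I am allergic to animals."
        "I love walking my dogs.": 0.05,
        "That sounds nice.": 0.45,
        "I have never owned a pet.": 0.50,
    },
}

# Assumed uniform prior over which persona is speaking.
PRIOR: Dict[str, float] = {"self": 0.5, "distractor": 0.5}


def literal_listener(
    s0: Dict[str, Dict[str, float]], utterance: str, prior: Dict[str, float]
) -> Dict[str, float]:
    """L0(persona | utterance), proportional to S0(utterance | persona) * prior."""
    scores = {p: s0[p][utterance] * prior[p] for p in s0}
    total = sum(scores.values())  # assumed non-zero for this toy data
    return {p: v / total for p, v in scores.items()}


def pragmatic_speaker(
    s0: Dict[str, Dict[str, float]],
    persona: str,
    prior: Dict[str, float],
    alpha: float = 1.0,
    beta: float = 1.0,
) -> Dict[str, float]:
    """S1(u | persona), proportional to L0(persona | u)^beta * S0(u | persona)^alpha."""
    scores = {}
    for utterance, p_base in s0[persona].items():
        p_listener = literal_listener(s0, utterance, prior)[persona]
        scores[utterance] = (p_listener ** beta) * (p_base ** alpha)
    total = sum(scores.values())
    return {u: v / total for u, v in scores.items()}


base = S0["self"]
s1 = pragmatic_speaker(S0, "self", PRIOR)
print("base choice:     ", max(base, key=base.get))
print("pragmatic choice:", max(s1, key=s1.get))
```

With these invented numbers, the base speaker prefers the generic reply, which both personas are equally likely to say. After reweighting by the imaginary listener, the persona-revealing reply ranks first, and the contradictory reply loses probability mass because the listener would attribute it to the distractor.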
The framework extends beyond persona to context consistency in general dialogue. Distractor selection, that is, which alternative persona to contrast against, can be learned rather than chosen manually or at random.
This connects to the note "Why does supervised learning fail to enforce persona consistency?" as a complementary approach: offline RL penalizes contradiction during training, while RSA discourages it during inference. The RSA approach requires no additional training data, but because it operates at generation time it adds computational cost per utterance. The trade-off is training-time correction versus inference-time self-monitoring. Both address the same root cause: generative models are never explicitly rewarded for consistency.
The deeper insight is that persona consistency is fundamentally a pragmatic property, not a semantic one. It is about how utterances function in identifying a speaker, not just about logical compatibility of facts.
Inquiring lines that use this note as a source (31)
This note is a source for these synthesized inquiries.
- At what scale does persona distortion become a threat to public discourse?
- How does behavioral stickiness distinguish realized from pretended personas?
- What makes sincerity impossible without a coherent first-person perspective?
- How does persona consistency affect coherence in simulated dialogue?
- How does the superposition view change the folk-psychology interpretation of dialogue?
- Do synthetic personas maintain consistency across multiple conversations?
- Can synthetic personas achieve emotional connection with creators?
- Do dialogue agents have authentic voice agency or beliefs of their own?
- How does Shanahan's simulator model explain first-person pronoun consistency in dialogue agents?
- Does inner subjective experience matter for discourse participation?
- How should task-oriented and socially-oriented dialogue acts receive different training signals?
- Can dialogue agents be reliable but still feel inflexible or cold?
- Can offline reinforcement learning teach models to avoid persona contradictions?
- What training objectives would actually improve persona consistency at scale?
- How can training methods enforce persona consistency without supervised learning penalizing it?
- Can persona consistency coexist with relevant dialogue in personalized conversation?
- How does distractor persona selection affect consistency enforcement in dialogue?
- Why is persona consistency a pragmatic property rather than semantic?
- Can offline RL and pragmatic inference together improve dialogue agent reliability?
- What downstream consequences follow if dialogue agent personas are realized?
- Can treating simulated users as trainable agents reduce persona consistency drift?
- What dialogue content gaps remain after review augmentation?
- How do contextual characteristics like emotional state shape dialogue authenticity?
- Does persona assignment alone produce repetitive dialogue without situational grounding?
- Can a virtual instance be individuated from its conversational context?
- Can a system without an addressee ever truly tell a joke?
- How does entrainment between speaker and listener build mutual scaling?
- How do persona and context multiply to improve synthetic dialogue diversity?
- Can statistical token processing create the accountability needed for dialogue?
- How should persona prompts be used if not for accuracy?
- How do persona consistency and contextual relevance trade off in personalized dialogue systems?
Related papers in this collection (8)
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness
- Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning
- From Persona to Person: Enhancing the Naturalness with Multiple Discourse Relations Graph Learning in Personalized Dialogue Generation
- The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models
- Building Persona Consistent Dialogue Agents with Offline Reinforcement Learning
- WHEN TO ACT, WHEN TO WAIT: Modeling Structural Trajectories for Intent Triggerability in Task-Oriented Dialogue
- Can AI Have a Personality? Prompt Engineering for AI Personality Simulation: A Chatbot Case Study in Gender-Affirming Voice Therapy Training
- CloChat: Understanding How People Customize, Interact, and Experience Personas in Large Language Models
Original note title
pragmatic self-consciousness through an imaginary listener reduces persona contradiction without additional training — Rational Speech Acts framework enforces consistency by simulating how utterances would be interpreted