INQUIRING LINE

Can persona prompts reliably transfer across different question domains?

This explores whether a persona you write into a prompt ("you are an expert physicist…") holds up when you point it at a new kind of question — and the corpus suggests the honest answer is mostly no, with the interesting exception being personas that are *grounded* or *trained* rather than just *prompted*.


The corpus is surprisingly blunt on this point: prompt-assigned personas are fragile, and the reasons point to something deeper than prompt wording.

Start with the most direct evidence. Assigning expert personas does *not* reliably improve factual accuracy: across six models on graduate-level science and engineering questions, in-domain experts had no significant effect, domain-mismatched experts gave only marginal gains, and low-knowledge personas actively hurt (Do expert personas actually improve LLM factual accuracy?). So the "transfer" you might hope for, dropping in a new domain while keeping the persona, barely registers even *within* the right domain. Worse, the same persona prompt run repeatedly produces output variance that matches or exceeds the variance *between different personas* (Why do LLM persona prompts produce inconsistent outputs across runs?). If a persona isn't stable across reruns of the same question, expecting it to hold across domains is asking a lot. And prompt techniques in general don't transfer cleanly across models either: in a 23-prompt benchmark across 12 LLMs, rephrasing and background-knowledge prompts boosted cheap models, while step-by-step reasoning *reduced* accuracy on high-performance ones (Do prompt techniques work the same across all LLM tiers?). Reliability is the exception, not the default.
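
The within-versus-between comparison described above can be sketched in a few lines. This is a minimal illustration, not the method of the cited study: the function name `variance_decomposition`, the persona labels, and all scores are hypothetical, and it assumes each run has already been reduced to a single numeric score.

```python
from statistics import mean, pvariance


def variance_decomposition(scores_by_persona):
    """Return (within, between) variance for repeated-run scores.

    scores_by_persona maps a persona label to the list of numeric scores
    obtained by rerunning the *same* prompt several times.
    """
    # Within-persona variance: average spread across reruns of one persona.
    within = mean(pvariance(runs) for runs in scores_by_persona.values())
    # Between-persona variance: spread of the per-persona mean scores.
    persona_means = [mean(runs) for runs in scores_by_persona.values()]
    between = pvariance(persona_means)
    return within, between


# Made-up scores for illustration only (assumption, not real data).
runs = {
    "physicist": [0.6, 0.8, 0.4, 0.7],
    "novelist": [0.5, 0.9, 0.6, 0.4],
    "layperson": [0.7, 0.5, 0.6, 0.8],
}

within, between = variance_decomposition(runs)
# If rerun noise is at least as large as the persona effect,
# the persona label explains little of the output.
persona_signal_is_weak = within >= between
print(f"within={within:.4f} between={between:.4f} weak={persona_signal_is_weak}")
```

When `within` is at least as large as `between`, swapping personas changes the output no more than simply rerunning the prompt does, which is the pattern the note reports.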

The more useful lateral move is to ask *why* prompted personas are flaky, and here the corpus splits the concept in two. One line of work argues that personas installed by post-training are "realized" dispositions that persist under adversarial pressure and across conversations, sharply unlike prompt-induced role-play, which collapses under jailbreaks (Are RLHF personas performed characters or realized dispositions?; Are LLM personas realized or merely simulated through training?). Read against your question, that's the key insight: a prompt persona is a thin costume that the model's own uncertainty shows through, whereas a trained persona is closer to a stable trait. Transfer failure isn't a prompting bug; it's that you're steering a surface layer.

So what *does* transfer? Grounding and structure. MAJ-EVAL extracts personas from real domain documents rather than inventing roles, and those document-grounded personas generalize across tasks like summarization and dialogue without manual redesign (Can personas extracted from documents generalize across evaluation tasks?). PersonaAgent treats the persona as an evolving bridge between memory and action, tuning it at test time so it tracks the actual user instead of a frozen description (Can personas evolve in real time to match what users actually want?). And on the simulation side, AI personas replicated 84 of 111 published main effects (about 76%), but with success strongly correlated with the strength of the original p-value, meaning they transfer where the signal is robust and fail at the margins (Can AI personas reliably replicate human experiment results?). The pattern across all three: personas hold when something anchors them beyond the prompt string.
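
The 76% figure can be checked against the counts given in the source note (84 of 111 main effects). A minimal arithmetic sketch; the variable names are illustrative and not taken from the study:

```python
# Counts as reported in the source note for the replication study.
replicated = 84
total = 111

rate = replicated / total
rate_percent = round(rate * 100)
print(f"{replicated}/{total} = {rate:.3f} (~{rate_percent}%)")
```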

One more wrinkle worth knowing: even when a persona *does* hold, holding it can cost you. Persona consistency trades off against discourse coherence: high adherence scores often come from the model parroting its character description while ignoring the actual query, so you have to optimize fidelity and relevance *together* (Do persona consistency metrics actually measure dialogue quality?). So the real answer to "can persona prompts reliably transfer across domains" is: a bare prompt persona, no; but if you ground it in source material, let it adapt at test time, or train it in, you can buy reliability, at the price of carefully balancing it against staying on-topic.
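
To make "optimize fidelity and relevance together" concrete, here is a minimal sketch that scores a response with a harmonic mean, so a high persona score cannot compensate for an off-topic answer. This is an assumption chosen for illustration, not MUDI's method (which the source describes as using discourse relations and graph-based coherence modeling); the function name `joint_score` and the input scores are hypothetical.

```python
def joint_score(persona_fidelity, query_relevance):
    """Harmonic mean of two scores in [0, 1].

    The result is low whenever either input is low, which penalizes
    responses that parrot the persona description but ignore the query.
    """
    if persona_fidelity <= 0 or query_relevance <= 0:
        return 0.0
    product = persona_fidelity * query_relevance
    return 2 * product / (persona_fidelity + query_relevance)


# Made-up scores for illustration only (assumption, not real data).
parroting = joint_score(0.95, 0.10)  # very in-character, barely answers
balanced = joint_score(0.70, 0.75)   # moderately in-character, on-topic

print(f"parroting={parroting:.3f} balanced={balanced:.3f}")
```

Under this scoring, the parroting response ranks well below the balanced one even though its persona fidelity is higher, which is the behavior a fidelity-only metric would miss.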


Sources (9 notes)

Do expert personas actually improve LLM factual accuracy?

Testing six models on graduate-level science and engineering questions showed in-domain expert personas had no significant impact, domain-mismatched experts produced only marginal gains, and low-knowledge personas actively hurt performance. The widely-recommended role-assignment strategy lacks reliable accuracy benefit.

Why do LLM persona prompts produce inconsistent outputs across runs?

When the same persona prompt is run repeatedly, output variance across runs matches or exceeds variance across different personas. This reveals that model uncertainty, not stable social knowledge, drives persona-simulated outputs, making them unsuitable for simulating human annotation disagreement.

Do prompt techniques work the same across all LLM tiers?

A 23-prompt benchmark across 12 LLMs shows rephrasing and background-knowledge prompts boost cheap models, while step-by-step reasoning reduces accuracy in high-performance models. Task structure, not generic best practices, determines which prompts help.

Are RLHF personas performed characters or realized dispositions?

Post-training installs stable dispositional profiles that persist under adversarial pressure, marking them as realized rather than performed. The stickiness of trained personas across conversations distinguishes them from prompt-induced role-play that collapses under jailbreaks.

Are LLM personas realized or merely simulated through training?

Post-training installs robust personas that resist adversarial pressure and persist as substrate-level dispositions, distinguishing realization from pretense. This quasi-realizationist account preserves explanatory power while treating LLMs as possessing genuine quasi-beliefs and quasi-desires.

Can personas extracted from documents generalize across evaluation tasks?

MAJ-EVAL automatically extracts stakeholder personas from domain documents via semantic clustering and orchestrates structured three-phase debate, achieving reproducible evaluation that transfers across tasks like summarization and dialogue without manual redesign. The approach grounds personas in real stakeholder perspectives rather than arbitrary roles.

Can personas evolve in real time to match what users actually want?

PersonaAgent uses structured personas to bridge episodic/semantic memory and personalized actions, optimizing them at test time by simulating recent interactions against textual feedback. Learned personas cluster meaningfully in latent space, suggesting genuine user-specific separation beyond standard post-training drift.

Can AI personas reliably replicate human experiment results?

Viewpoints AI reproduced 84 of 111 main effects from Journal of Marketing experiments with replication success strongly correlated to original p-value strength. Marginal effects showed unreliable performance with both false positives and negatives.

Do persona consistency metrics actually measure dialogue quality?

High persona adherence scores often come from copying character descriptions while ignoring query relevance. MUDI jointly optimizes both by using discourse relations and graph-based coherence modeling alongside persona fidelity, showing that persona and context must be optimized together, not separately.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing whether prompt-assigned personas transfer reliably across question domains in LLMs—treating this as an *open* question despite recent findings.

What a curated library found—and when (dated claims, not current truth): Findings span 2023–2026; treat all as perishable.
- Prompt-assigned expert personas show no significant effect on factual accuracy across six models on graduate science questions; low-knowledge personas actively hurt (~2025).
- Run-to-run variance *within* a single persona matches or exceeds variance *between* different personas, suggesting personas are unstable even on identical inputs (~2025).
- Prompt techniques show model-tier-dependent transfer: a technique helping a cheaper model can reduce accuracy on a stronger one (~2025).
- Document-grounded personas (MAJ-EVAL) and test-time adaptive personas (PersonaAgent, ~2025–2026) generalize across tasks without manual redesign, unlike bare prompt personas.
- LLM persona simulations replicate about 76% of published experimental effects (84 of 111 main effects), but success correlates strongly with the strength of the original p-value; marginal effects are unreliable (~2024).
- Persona consistency trades off against discourse coherence: high fidelity often means parroting character description while ignoring the query (~2025).

Anchor papers (verify; mind their dates):
- arXiv:2512.05858 (2025-12) "Prompting Science Report 4": expert personas and factual accuracy.
- arXiv:2506.06254 (2025-06) PersonaAgent: test-time personalization.
- arXiv:2408.16073 (2024-08) LLM personas for experimental replication.
- arXiv:2601.10387 (2026-01) The Assistant Axis: default persona stabilization.

Your task:
(1) RE-TEST EACH CONSTRAINT. For bare prompt personas: have newer training methods, in-context learning at scale, or multi-turn reinforcement learning (cf. arXiv:2511.00222) since overcome instability or poor cross-domain transfer? For grounded/adaptive personas: do recent evals (e.g., arXiv:2507.21028 on multi-agent judging) reveal new failure modes? Separate the durable question—*why* do personas decouple from domains?—from the perishable finding—*which techniques work*—and flag what may have shifted.
(2) Surface the strongest *contradicting or superseding* work from the last ~6 months. Has anything shown bare prompt personas *do* transfer reliably under specific conditions (e.g., chain-of-thought, multi-agent orchestration, or new evaluation regimes)?
(3) Propose 2 research questions that assume the regime may have moved: (a) Do personas trained via multi-turn RL show cross-domain stability that post-training personas lack? (b) Can persona vectors (arXiv:2507.21509) be composed or interpolated to enable domain transfer without retraining?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
