Can models converge on similar experience descriptions across different architectures?
This explores whether different models—built on different architectures and trained separately—land on the *same* descriptions when asked open-ended questions, rather than producing genuinely varied outputs.
The corpus has a surprisingly direct answer: yes, far more than you'd expect. The clearest evidence is the "Artificial Hivemind" effect (see "Do different AI models actually produce diverse outputs?"), where 70+ models across 26K open-ended queries generated strikingly similar, sometimes identical, responses. The convergence isn't because the models share architecture; the note attributes it to overlapping training data and alignment procedures. So the diversity you think you're buying by ensembling different models is partly an illusion: they were all sculpted toward the same center.
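One way to make "strikingly similar" concrete is to score how much a set of responses to the same prompt overlap. The sketch below is a minimal illustration, not the method used in the cited note: it assumes word-set Jaccard overlap as the similarity measure, and the function names and sample responses are hypothetical.

```python
from itertools import combinations


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of punctuation-stripped words."""
    return {w.strip(".,;:!?\"'").lower() for w in text.split()} - {""}


def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard overlap: |A & B| / |A | B|, in [0, 1]."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 1.0  # two empty responses are treated as identical
    return len(ta & tb) / len(ta | tb)


def mean_pairwise_similarity(responses: dict[str, str]) -> float:
    """Average Jaccard overlap over every pair of model responses."""
    pairs = list(combinations(responses.values(), 2))
    if not pairs:
        raise ValueError("need responses from at least two models")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical responses from three models to one open-ended prompt.
    sample = {
        "model_a": "Time is a river that carries us forward.",
        "model_b": "Time is a river, carrying us ever forward.",
        "model_c": "Time is a river that carries everything forward.",
    }
    print(f"mean pairwise overlap: {mean_pairwise_similarity(sample):.2f}")
```

A score near 1 across models from different families would be the kind of signal the hivemind finding describes. Word overlap is a crude proxy; an embedding-based similarity would also catch paraphrases that share few words.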
The more interesting question is *where* that convergence comes from, and here a second note adds a mechanism. RL post-training doesn't just nudge outputs: it actively collapses format diversity, amplifying one dominant pattern from pretraining within the first epoch while suppressing the alternatives (see "Does RL training collapse format diversity in pretrained models?"). Strikingly, *which* format wins depends on model scale, not necessarily on performance. Put the two findings together and a picture emerges: pretraining on overlapping web corpora gives models a shared starting vocabulary, and the alignment/RL stage then funnels each one toward a single mode. Convergence is manufactured at two stages, not one.
There's a tension worth pulling on, though, because convergence isn't uniform across task types. Training on structured tasks (math, code) drives output entropy *down* toward a single answer, while creative, open-ended tasks push entropy *up* (see "Does training order reshape how models handle different task types?"). That suggests the hivemind effect should be strongest exactly where the corpus measured it, in open generation that has nonetheless been homogenized by shared alignment, and that the order in which you train domains can either preserve or destroy a model's ability to diverge.
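The entropy contrast can be illustrated with an empirical Shannon entropy over sampled outputs for one prompt. This is a minimal sketch under the assumption that outputs are compared as exact strings; the function name and the sample lists are hypothetical, and the cited work may measure entropy differently (for example at the token level).

```python
import math
from collections import Counter


def output_entropy(samples: list[str]) -> float:
    """Shannon entropy (in bits) of the empirical distribution of sampled outputs."""
    if not samples:
        raise ValueError("need at least one sample")
    total = len(samples)
    counts = Counter(samples)
    # H = sum over distinct outputs of p * log2(1 / p), with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical samples: a math prompt where every draw gives the same answer,
    # and a creative prompt where four draws give four different openings.
    structured = ["4"] * 8
    creative = ["a storm", "a letter", "a door", "a stranger"]
    print(f"structured: {output_entropy(structured):.2f} bits")
    print(f"creative:   {output_entropy(creative):.2f} bits")
```

With these toy samples the structured prompt scores 0 bits (one outcome) and the creative prompt scores 2 bits (four equally likely outcomes). Entropy collapse on an open-ended task would show up as the second number drifting toward the first.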
A quieter, almost philosophical thread runs underneath your phrase "experience descriptions." The persona-replication work shows that AI personas reproduced 76% (84 of 111) of the main effects from a set of published human experiments, with success tracking the strength of the original evidence (see "Can AI personas reliably replicate human experiment results?"). In other words, models converge on describing human experience the way the literature already described it. And the multi-agent equivalence result (see "Can branching prompts replicate what multi-agent systems do?") hints that even *within* one model, the apparent diversity of multiple "voices" collapses to structurally equivalent outputs. Diversity, across and within models, keeps reducing to the same few attractors.
The thing you might not have known you wanted to know: this convergence is a feature for reliability and a bug for creativity, and the field is starting to treat it as something you *schedule and engineer* rather than something innate. If you want to chase the levers, the entropy-scheduling work and the RL-format-collapse work are the two doorways—they're where convergence stops being mysterious and starts being a dial you can turn.
Sources (5 notes)
Do different AI models actually produce diverse outputs? INFINITY-CHAT analyzed 70+ models across 26K open-ended queries and found an "Artificial Hivemind" effect: models independently generate strikingly similar or identical responses due to overlapping training data and alignment procedures, undermining the diversity benefits of model ensembles.
Does RL training collapse format diversity in pretrained models? Controlled experiments show RL consistently amplifies one format distribution from pretraining within the first epoch while collapsing alternatives. The winning format depends on model scale, not necessarily performance, and is largely hidden when starting from proprietary pretrained models.
Does training order reshape how models handle different task types? Omni-Thinker shows structured domains decrease output entropy while creative domains increase it. BWT-guided scheduling (training structured tasks first) yields 6.2% gains over joint training by preventing entropy collapse from damaging open-ended capabilities.
Can AI personas reliably replicate human experiment results? Viewpoints AI reproduced 84 of 111 main effects from Journal of Marketing experiments, with replication success strongly correlated with original p-value strength. Marginal effects showed unreliable performance, with both false positives and false negatives.
Can branching prompts replicate what multi-agent systems do? Research shows single LLMs using dynamic persona simulation achieve multi-agent cognitive synergy without multiple model instances. Solo Performance Prompting validates that structured prompting techniques map directly to multi-agent debate architectures, enabling equivalent outcomes through structural equivalence.