INQUIRING LINE

What property must remain constant to individuate an LLM across infrastructure changes?

This explores the philosophical puzzle of what 'counts' as the same LLM when the physical machinery underneath keeps shifting — and the corpus's surprising answer is that the thing you can hold constant isn't in the hardware or even the model weights at all.


What stays fixed enough to say "this is the same LLM I was just talking to" when the infrastructure underneath changes? The corpus pushes back hard on the intuitive answers. The instinctive guess is hardware: surely an LLM is the machine running it. But the note "Can we identify an LLM interlocutor with a single hardware instance?" dismantles that. Load-balancing and model-parallelism scatter a single conversation across multiple physical instances, while batching funnels many conversations through one. There's no stable one-to-one map between a chat and a chip, so hardware can't be the thing that individuates.
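To make the mapping argument concrete, here is a minimal Python sketch of a toy round-robin router. The replica names, the `route_round_robin` helper, and the interleaved turn order are all hypothetical, and a replica serving several chats stands in only loosely for batching; nothing here models a real serving stack.

```python
from collections import defaultdict
from itertools import cycle

# Hypothetical replica names, for illustration only.
REPLICAS = ["gpu-a", "gpu-b", "gpu-c"]


def route_round_robin(turns, replicas=REPLICAS):
    """Assign each (conversation_id, turn_index) to the next replica in rotation."""
    next_replica = cycle(replicas)
    return [(conv, turn, next(next_replica)) for conv, turn in turns]


# Two conversations whose turns arrive interleaved (an assumed arrival order).
turns = [("chat-1", 0), ("chat-2", 0), ("chat-1", 1), ("chat-2", 1), ("chat-1", 2)]
placement = route_round_robin(turns)

# Build the mapping in both directions.
replicas_per_chat = defaultdict(set)
chats_per_replica = defaultdict(set)
for conv, _turn, replica in placement:
    replicas_per_chat[conv].add(replica)
    chats_per_replica[replica].add(conv)

print(dict(replicas_per_chat))
print(dict(chats_per_replica))
```

In this toy run, `chat-1` touches all three replicas and `gpu-a` serves both chats, so neither direction of the map is one-to-one.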

The next guess is the weights, that is, the model itself. But "What actually specifies a virtual instance in conversation?" argues that the model alone doesn't specify which instance you're talking to either. A "virtual instance" is jointly produced: the conversational context, the language built up between you and the system, is what actually picks out this particular interlocutor. Persistence is distributed across conversation, infrastructure, and weights rather than sitting in any one of them. Of those three, only the conversation cannot be swapped out without changing who you are talking to, so the property that must stay constant is the conversation, the accumulated token string, not the substrate computing over it.

This lands harder once you see what the model lacks. "Does an LLM have anything that persists between conversations?" points out that humans carry a continuous biological body that preserves the residue of an interaction even while they sleep. The LLM has no such carrier; the virtual instance is reconstituted from stored text every single time, which is why a resumed conversation and a brand-new one are structurally identical. Identity isn't stored in the thing; it's reassembled from the transcript. The transcript is the individuating constant precisely because nothing else survives the gap.
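The reconstitution claim can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description of how any system stores conversations: `instance_key` and `reconstitute` are hypothetical helpers, the host names are made up, and the sketch encodes the essay's thesis (the transcript is the only input that matters) rather than proving it.

```python
import hashlib


def instance_key(transcript):
    """Fingerprint a transcript given as a list of (role, text) pairs."""
    digest = hashlib.sha256()
    for role, text in transcript:
        # A zero byte separates fields so ("ab", "c") and ("a", "bc") differ.
        digest.update(role.encode("utf-8") + b"\x00" + text.encode("utf-8") + b"\x00")
    return digest.hexdigest()


def reconstitute(transcript, host):
    """Hypothetical stand-in for reloading a conversation on some host.

    The host argument is accepted but deliberately ignored: by assumption,
    nothing carries over between sessions except the stored text.
    """
    return instance_key(transcript)


stored = [("user", "Call me Ishmael."), ("assistant", "Noted, Ishmael.")]

resumed = reconstitute(list(stored), host="gpu-a")  # same text, same host
moved = reconstitute(list(stored), host="gpu-z")    # same text, different host
altered = reconstitute(stored + [("user", "Actually, call me Ahab.")], host="gpu-a")

print(resumed == moved)    # True
print(resumed == altered)  # False
```

Under these assumptions, resuming on a different host yields the same key, while changing the text yields a different one even on the same host.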

And the reason the conversation has to do all this work is that there's no fixed "character" underneath to fall back on. "Does an LLM commit to a single character or maintain many?" shows that the model holds a superposition of many consistent personas, one that narrows only as the conversation accumulates; each reply samples from a distribution. Even pinning the machinery doesn't pin the identity: "Does setting temperature to zero actually make LLM outputs reliable?" notes that zero temperature and a fixed seed just replay one draw from that distribution, not a stable self. The narrowing is done by the context, which is one more reason individuation rides on the conversation rather than the configuration.
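A minimal Python sketch of the "one draw, replayed" point, using only the standard library. The persona names and weights are invented for illustration, and a three-way categorical distribution is a drastic simplification of what a model actually samples from; `greedy_persona` is only a loose analogue of zero temperature.

```python
import random

# Hypothetical persona distribution; names and weights are illustrative only.
PERSONAS = ["formal", "playful", "terse"]
WEIGHTS = [0.5, 0.3, 0.2]


def sample_persona(seed):
    """Draw one persona from the distribution using a freshly seeded generator."""
    rng = random.Random(seed)
    return rng.choices(PERSONAS, weights=WEIGHTS, k=1)[0]


def greedy_persona():
    """Zero-temperature analogue: always return the most probable persona."""
    return max(zip(WEIGHTS, PERSONAS))[1]


# Replaying one seed 100 times repeats the same draw every time.
replays = {sample_persona(seed=7) for _ in range(100)}

# Varying the seed exposes the spread that the fixed seed was hiding.
across_seeds = {sample_persona(seed=s) for s in range(200)}

print(len(replays))       # 1
print(len(across_seeds))  # more than 1
print(greedy_persona())   # formal
```

Here `replays` collapses to a single persona while `across_seeds` contains more than one: repeatability under a fixed configuration says nothing about how wide the underlying distribution is.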

The thing worth walking away with: an LLM's identity is less like a person you revisit and more like a document you reload. What you keep constant to keep "the same" model isn't anything physical — it's the text you and it have written together. Swap the GPUs, re-shard the weights across new hardware, and as long as the conversational context is fed back in, the same virtual instance reappears. Change the context, and even on identical hardware, you're talking to someone else.


Sources (5 notes)

Can we identify an LLM interlocutor with a single hardware instance?

Load-balancing and model-parallelism route single conversations across multiple hardware instances, while batching routes multiple conversations through one instance. These architectural facts break any stable one-to-one mapping, making hardware an untenable level of individuation.

What actually specifies a virtual instance in conversation?

The conversational context—jointly produced language between human and system—specifies the virtual instance, not any property of the model itself. Persistence is distributed across conversation, infrastructure, and model weights rather than located in the AI.

Does an LLM have anything that persists between conversations?

While humans have a continuous biological-phenomenological substrate that preserves interaction effects during dormancy, LLMs have no analogous carrier. The virtual instance is reconstituted from stored text each time, making resumed and new conversations structurally identical.

Does an LLM commit to a single character or maintain many?

Research shows LLMs don't commit to a single character but instead maintain a probability distribution over many consistent simulacra. Each response samples from this distribution, explaining why regenerations can yield different personalities while remaining consistent with prior context.

Does setting temperature to zero actually make LLM outputs reliable?

Fixed seeds and zero temperature replicate the same output repeatedly, but that output remains one draw from the model's probability distribution. McDonald's omega testing across 100 repetitions reveals that consistency does not equal reliability.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing claims about LLM individuation across infrastructure changes. The question remains open: what property must stay constant to say "this is the same LLM" when hardware, serving, or orchestration shifts?

What a curated library found — and when (dated claims, not current truth):
Findings span 2023–2026. The library argues:
- Hardware instance identity fails: load-balancing and model-parallelism scatter single conversations across multiple physical instances, while batching funnels many conversations through one (2023–2024).
- Model weights alone don't individuate: a "virtual instance" is jointly produced by conversational context + infrastructure + weights; context is what picks out a particular interlocutor (2023–2024).
- Conversation (the accumulated token string) is the individuating constant because the LLM has no biological carrier; the virtual instance is reconstituted from stored text each time, so the transcript is identity (2024–2025).
- LLMs maintain a superposition of personas narrowed only by context accumulation; zero temperature + fixed seed just replay one draw, not a stable self (2024–2025).
- Multi-turn conversation degradation is observable: models get lost in long context, and resuming a conversation reconstructs identity from transcript alone (2025–2026).

Anchor papers (verify; mind their dates):
- arXiv:2305.16367 (2023): Role-Play with Large Language Models
- arXiv:2311.08719 (2023): Think-in-Memory: Long-Term Memory in LLMs
- arXiv:2505.06120 (2025): LLMs Get Lost In Multi-Turn Conversation
- arXiv:2605.30621 (2026): Harness Updating & Self-Evolution Capabilities

Your task:
(1) RE-TEST EACH CONSTRAINT. For every claim above—especially the thesis that conversation (not hardware, weights, or config) individuates—judge whether newer serving stacks (continuous batching, KV cache management, retrieval-augmented identity systems), test-time scaling, in-context learning methods, or agent memory architectures have since RELAXED or OVERTURNED the hypothesis. Separate the durable question (what grounds identity?) from perishable limitations (does context alone suffice, or do trainable adapters, LoRAs, or specialized harnesses now anchor it?). Cite what resolved it.
(2) Surface the strongest DISAGREEMENT or SUPERSEDING work from the last ~6 months: does any recent paper argue that hardware, weight-binding, or deterministic configuration CAN ground identity independently of conversation? Flag contradictions.
(3) Propose 2 research questions that ASSUME the regime may have moved: e.g., do multi-modal context streams or agent-internal state vectors now carry individuation weight that conversation alone cannot? Can identity survive a weights update?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
