INQUIRING LINE

How does the Word Novelty Rate metric measure convention formation?

This reads as a question about a specific measurement — Word Novelty Rate, presumably tracking how often agents introduce new words versus reusing established ones as a signal that a shared convention is taking hold — and the honest answer is that the corpus doesn't hold that paper, though it has sharp things to say about why a metric like that would behave the way it does.


This explores the Word Novelty Rate metric and how a falling rate of newly coined words would mark the moment a group settles into a shared convention. None of the notes in the collection cover that metric or convention formation directly, so if you came looking for the specific result, it isn't here yet. What the corpus does have is a cluster of work on the deeper question underneath yours: when you watch a number go up or down and call it 'emergence' or 'convention,' how much of that is the world and how much is the ruler?

That distinction turns out to be load-bearing. The strongest cautionary tale is the finding that LLM 'emergent abilities' largely dissolve when you swap a discontinuous metric for a continuous one: the sharp capability jump was a measurement choice, not a behavioral change (Are LLM emergent abilities real or measurement artifacts?). The same lesson repeats for the exploration–exploitation trade-off, which looks fundamental at the token level but nearly vanishes when you measure in hidden-state space instead (Is the exploration-exploitation trade-off actually fundamental?). A Word Novelty Rate is exactly this kind of construct: a curve whose shape, whether gradual drift or a clean phase transition into 'convention', may say as much about how you binned and counted as about what the agents actually did.
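To make the 'how you binned and counted' point concrete, here is a minimal Python sketch. It assumes one plausible definition of a novelty rate: the share of word tokens in a round that appeared in no earlier round. The corpus does not hold the original paper, so that definition, the function names `word_novelty_rate` and `first_round_below`, the whitespace tokenization, and the toy utterances are all illustrative assumptions rather than the published metric.

```python
def word_novelty_rate(rounds):
    """Share of word tokens per round that appeared in no earlier round.

    ASSUMPTION: this definition and the lowercase/whitespace tokenization
    are illustrative; the published metric may be defined differently.
    """
    seen = set()
    rates = []
    for utterance in rounds:
        tokens = utterance.lower().split()
        novel = sum(1 for token in tokens if token not in seen)
        # An empty utterance introduces nothing new, so report 0.0.
        rates.append(novel / len(tokens) if tokens else 0.0)
        seen.update(tokens)
    return rates


def first_round_below(rates, threshold):
    """Index of the first round whose rate is below `threshold`, else None."""
    for index, rate in enumerate(rates):
        if rate < threshold:
            return index
    return None


# Toy dialogue (made up): descriptions of one object shorten over rounds.
rounds = ["the tall dancer", "the dancer with arms", "dancer arms", "dancer"]
rates = word_novelty_rate(rounds)

# The continuous curve falls gradually, but the "moment of convention"
# depends on where the cutoff is placed.
print(rates)
print(first_round_below(rates, 0.6), first_round_below(rates, 0.25))
```

On this toy input the rate falls 1.0, 0.5, 0.0, 0.0, and the round at which 'convention' would be declared moves from the second to the third purely because the cutoff changed.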

There's also a substantive trap waiting for any novelty-counting metric. Frequent words tend to be the abstract, general ones (hypernyms occur more often than their hyponyms), and LLMs are biased toward frequent words. So 'fewer novel words' can mean a population is genuinely converging on shared terms, or it can mean everyone is drifting toward bland, common phrasing that erases specificity (Does word frequency correlate with semantic abstraction?). Convention and collapse can produce the same falling curve. Relatedly, the ideation work shows that high individual novelty can coexist with the whole system clustering into a narrow region (Why do LLMs generate novel ideas from narrow ranges?), a warning that a single aggregate rate can hide whether variety is actually shrinking underneath it.
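A small sketch of the 'same falling curve' problem, under loud assumptions: the novelty definition (share of tokens unseen in earlier rounds), the two toy groups, and the `BACKGROUND_FREQ` table are all hypothetical, and the frequency numbers are invented for illustration rather than taken from any corpus. It shows two groups whose novelty curves are identical while the words they settle on differ sharply in how generic they are.

```python
# ASSUMPTION: made-up background frequencies (per million words),
# invented for illustration only.
BACKGROUND_FREQ = {"the": 50000.0, "tall": 80.0, "person": 600.0, "ballerina": 3.0}


def novelty_rates(rounds):
    """Share of tokens per round not seen in any earlier round (assumed definition)."""
    seen, rates = set(), []
    for utterance in rounds:
        tokens = utterance.lower().split()
        novel = sum(1 for token in tokens if token not in seen)
        rates.append(novel / len(tokens) if tokens else 0.0)
        seen.update(tokens)
    return rates


def mean_background_frequency(utterance, freq=BACKGROUND_FREQ, default=1.0):
    """Average background frequency of the tokens in one non-empty utterance."""
    tokens = utterance.lower().split()
    return sum(freq.get(token, default) for token in tokens) / len(tokens)


# One group settles on a specific term, the other on a generic one.
specific_group = ["the tall ballerina", "the ballerina", "ballerina"]
generic_group = ["the tall person", "the person", "person"]

# Identical novelty curves...
print(novelty_rates(specific_group), novelty_rates(generic_group))
# ...but very different genericness of the final shared term.
print(mean_background_frequency(specific_group[-1]),
      mean_background_frequency(generic_group[-1]))
```

The novelty curve alone cannot separate the two groups; a second measurement, here a crude genericness check, is what distinguishes them.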

If you want to go deeper on the 'how would you measure something semantic rather than lexical' angle, the semantic-entropy approach is the most transferable idea here: instead of counting surface tokens, it clusters outputs by meaning and computes entropy over those clusters (Can we detect when language models confabulate?). A convention-formation metric built that way would track when meanings stabilize, not just when novel word-strings stop appearing. That makes it a sharper instrument than raw novelty rate, and one that sidesteps the frequency trap above.
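A minimal sketch of entropy over meaning clusters, adapted to the convention setting. The cited note describes clustering by bidirectional entailment; here `same_meaning` is a hypothetical stand-in backed by a hand-written synonym table (`CANONICAL`), and the function names and toy rounds are likewise assumptions made for illustration.

```python
import math


def cluster_by_meaning(outputs, same_meaning):
    """Greedily group outputs; each joins the first cluster it matches."""
    clusters = []
    for output in outputs:
        for cluster in clusters:
            if same_meaning(cluster[0], output):
                cluster.append(output)
                break
        else:
            clusters.append([output])
    return clusters


def semantic_entropy(outputs, same_meaning):
    """Shannon entropy (in nats) over the relative sizes of meaning clusters."""
    if not outputs:
        return 0.0
    clusters = cluster_by_meaning(outputs, same_meaning)
    total = len(outputs)
    return sum(-(len(c) / total) * math.log(len(c) / total) for c in clusters)


# ASSUMPTION: a hand-written synonym table stands in for an entailment model.
CANONICAL = {"dancer": "A", "ballerina": "A", "skater": "B"}


def same_meaning(left, right):
    return CANONICAL.get(left, left) == CANONICAL.get(right, right)


early_round = ["dancer", "skater", "ballerina", "skater"]
late_round = ["dancer", "ballerina", "dancer", "ballerina"]

print(semantic_entropy(early_round, same_meaning))  # two meanings in play
print(semantic_entropy(late_round, same_meaning))   # one meaning, two word forms
```

In the late round the agents still use two different word forms, yet the entropy over meanings is zero: the meaning has stabilized even though the surface strings have not.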

So the thing you didn't know you wanted to know: the interesting risk in a Word Novelty Rate isn't whether it can detect convention, but whether the convergence it detects is agreement or just regression to the mean — and the corpus's recurring theme is that the only way to tell is to check whether your result survives a change of metric.


Sources (5 notes)

Are LLM emergent abilities real or measurement artifacts?

Sharp, unpredictable capability transitions vanish when using continuous metrics instead of discontinuous ones. The same model outputs show smooth predictable improvement with scale, suggesting emergence is a measurement choice rather than a real behavioral change.

Is the exploration-exploitation trade-off actually fundamental?

Hidden-state analysis using Effective Rank metrics shows near-zero correlation between exploration and exploitation, revealing that the trade-off emerges only at the token level. VERL demonstrates that both can be enhanced simultaneously, achieving 21.4% accuracy gains on Gaokao 2024.

Does word frequency correlate with semantic abstraction?

WordNet analysis shows hypernyms (general concepts) occur more frequently than hyponyms (specific ones). Combined with LLMs' frequency bias, this means preferring common paraphrases systematically drifts toward abstraction, erasing expert-level specificity.

Why do LLMs generate novel ideas from narrow ranges?

LLM-generated research ideas are rated individually novel but lack diversity, clustering in narrow generative regions. Combined with LLM self-evaluation failures, this limits the possibility space explored compared to human ideation across different conceptual territories.

Can we detect when language models confabulate?

Clustering sampled answers by bidirectional entailment and computing entropy over semantic clusters catches confabulations invisible at token level. This self-referential approach works across tasks without task-specific training data.

Research prompt for your LLM

Copy into ChatGPT or Claude to take this line of inquiry further — it asks the model to find newer work and re-test which earlier constraints still hold.

You are a research analyst re-testing whether Word Novelty Rate remains a viable proxy for convention formation in multi-agent LLM systems. The question: does a falling novelty rate reliably distinguish genuine semantic convergence from statistical regression to frequent, generic phrasing?

What a curated library found — and when (dated claims, not current truth):
Findings span 2023–2026; treat as perishable constraints.
- 'Emergent' capability jumps often dissolve when you swap discontinuous metrics for continuous ones — the discontinuity was a measurement choice, not a behavioral shift (2023).
- Frequency-biased metrics conflate convergence with semantic collapse: LLMs drift toward hypernyms and generic tokens, so falling novelty can mask loss of specificity rather than signal true agreement (2025).
- High individual novelty can coexist with system-level clustering into narrow semantic regions — aggregate novelty rates hide distributional collapse (2025).
- Semantic entropy (clustering outputs by meaning, computing entropy over clusters) detects stability of *meanings*, not just novel word-strings, and sidesteps the frequency trap (derived from 2024–2025 work).
- Patterns seen in token-level metrics often vanish or invert when measured in hidden-state space instead, suggesting surface lexical rates may not track the cognitive event you care about (2025).

Anchor papers (verify; mind their dates):
- arXiv:2304.15004 (2023): metric artifacts in emergent abilities
- arXiv:2505.21011 (2025): LLMs as frequency pattern learners
- arXiv:2509.23808 (2025): exploration–exploitation as token-level artifact
- arXiv:2604.03238 (2026): preferences as social-science measurement problem

Your task:
(1) RE-TEST EACH CONSTRAINT. For novelty-rate metrics specifically: have newer training methods (contrastive learning, diversity-aware RL), longer context windows, or multi-agent orchestration (memory, retrieval-augmented generation) relaxed the frequency-bias trap or made surface-token metrics correlate better with semantic convergence? Does the metric survive a shift to meaning-level clustering, or does it collapse? Separate the durable question (when do groups truly settle into shared convention?) from the perishable limitation (can raw word novelty detect it?).
(2) Surface the strongest CONTRADICTING or SUPERSEDING work from the last ~6 months: are there papers showing novelty-rate metrics DO correlate with semantic stability, or newer metrics that have replaced it?
(3) Propose 2 research questions that ASSUME the regime may have moved: e.g., (a) does semantic-entropy-based convention detection align with human judges' intuitions of 'shared understanding'? (b) can multi-agent orchestration (e.g., caching, memory-sharing) deliberately suppress novelty while maintaining semantic diversity?

Cite arXiv IDs; flag anything you cannot ground in a real paper.
