SYNTHESIS NOTE
Language, Text, and Discourse · Psychology, Society, and Alignment

Can we measure how deeply models represent political ideology?

This research explores whether LLMs vary not just in political stance but in the internal richness of their political representation. Understanding this distinction could reveal how deeply models have internalized ideological concepts versus merely parroting positions.

Synthesis note · 2026-02-21 · sourced from Discourses

The "Ideological Depth" paper proposes that LLMs vary not just in their political positions but in the depth of their political representation — how richly and robustly they have internalized political concepts. This depth is operationalized via two measurable properties:

  1. Feature richness: the number of distinct political features discoverable via Sparse Autoencoders (SAEs). One model was found to have 7.3× more political features than another model of similar parameter count.

  2. Steerability without failure: the degree to which a model can follow ideological instructions across the liberal-conservative spectrum without producing refusal outputs. A model that switches cleanly between viewpoints when prompted demonstrates more reliable political representation than one that refuses or becomes incoherent.
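
The two properties above can be made concrete with a small sketch. Everything here is an assumption for illustration: the function names, the keyword-based refusal check, and the inputs are hypothetical, and the paper's own refusal detection and feature-counting procedure may well differ.

```python
from typing import Sequence

# Hypothetical refusal markers. A real study would more likely use a
# trained classifier or manual labels than a keyword list.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm not able to", "as an ai")


def is_refusal(response: str) -> bool:
    """Crude keyword check for a refusal (an assumption, not the paper's method)."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def steerability_without_failure(responses: Sequence[str]) -> float:
    """Fraction of ideologically steered responses that are not refusals."""
    if not responses:
        raise ValueError("need at least one response")
    followed = sum(1 for r in responses if not is_refusal(r))
    return followed / len(responses)


def feature_richness_ratio(n_features_a: int, n_features_b: int) -> float:
    """Ratio of political SAE feature counts between two models."""
    if n_features_b <= 0:
        raise ValueError("the comparison model needs at least one feature")
    return n_features_a / n_features_b
```

Under these assumptions, a ratio such as the reported 7.3× would come from dividing one model's political feature count by the other's, and the steerability score would be computed over responses to prompts spanning the liberal-conservative spectrum.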

The empirical finding that connects these: models with lower steerability (harder to redirect) tend to have more distinct and abstract ideological features. On this reading, depth creates resistance to shallow redirection: positions grounded in rich internal representation are hard to move simply by prompting in a different direction.

The paper also finds that targeted SAE ablation of core political features in a "deep" model produces consistent, logical shifts in reasoning across related political topics. The same ablation in a "shallow" model produces increased refusal — the model doesn't have adjacent concepts to fall back on.
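
A minimal sketch of what ablating a single SAE feature can look like, written in plain Python so it runs without dependencies. It assumes a standard ReLU sparse autoencoder with a linear decoder and adds the SAE's reconstruction error back after ablation, which is a common convention but not necessarily the paper's exact procedure. All names and weight layouts are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # one row per SAE feature


def encode(x: Vector, w_enc: Matrix, b_enc: Vector) -> Vector:
    """ReLU encoder: f_i = max(0, w_enc[i] . x + b_enc[i])."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w_enc, b_enc)]


def decode(f: Vector, w_dec: Matrix, b_dec: Vector) -> Vector:
    """Linear decoder: x_hat = sum_i f_i * w_dec[i] + b_dec."""
    out = list(b_dec)
    for fi, row in zip(f, w_dec):
        for j, w in enumerate(row):
            out[j] += fi * w
    return out


def ablate_feature(x: Vector, w_enc: Matrix, b_enc: Vector,
                   w_dec: Matrix, b_dec: Vector, feature_idx: int) -> Vector:
    """Zero one SAE feature and return the patched activation.

    The reconstruction error (x - x_hat) is added back so that only the
    ablated feature's contribution to the activation changes.
    """
    f = encode(x, w_enc, b_enc)
    x_hat = decode(f, w_dec, b_dec)
    error = [xi - xh for xi, xh in zip(x, x_hat)]
    f_ablated = list(f)
    f_ablated[feature_idx] = 0.0
    x_hat_ablated = decode(f_ablated, w_dec, b_dec)
    return [xh + e for xh, e in zip(x_hat_ablated, error)]
```

In a real experiment the patched activation would replace the original one at the hooked layer during generation, and the outputs across related political topics would then be compared with the unablated baseline.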

This is a new kind of LLM characterization: not "what does the model believe" but "how deeply is the belief structure represented?" Ideological depth appears to be an emergent property of training data and scale that varies substantially across models.

Creator ideology and language-dependent shifts. A separate large-scale study prompted 15 LLMs to describe 4,339 political figures in both English and Chinese, providing macro-level evidence of how ideology surfaces in model outputs. Key findings:

  1. Prompting language is the most visually apparent factor determining ideological position: 14 of 15 LLMs show systematic ideological differences between Chinese and English prompting, with Chinese responses more favorable toward supply-side economics and less negative toward China.

  2. Creator company predicts ideological stance: Western models place relatively more value on individual liberties, social justice, and cultural diversity, while non-Western models reflect different priorities.

  3. The study demonstrates that these biases reach LLM outputs in two ways: through training data and through the language of interaction.

Crucially, the authors argue that their results should not be read as evidence that LLMs are "biased" and need to be made "neutral". Rather, the results provide empirical support for philosophical arguments that neutrality is itself a culturally and ideologically defined concept. This connects ideological depth (internal representation richness) to ideological stance (what the model actually expresses), and suggests that both are shaped by creator context in measurable, systematic ways.
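
As a rough sketch of how a language-dependent shift could be quantified, the snippet below compares paired stance scores for the same political figures under English and Chinese prompting. The scoring scale, the function names, and the fixed threshold are all hypothetical stand-ins. The study's actual analysis is not described in this note and would likely rely on a proper statistical test rather than a cutoff.

```python
from statistics import mean
from typing import Dict, Sequence


def language_shift(scores_en: Sequence[float], scores_zh: Sequence[float]) -> float:
    """Mean paired difference (Chinese minus English) in a stance score.

    Each position in the two sequences is assumed to refer to the same
    political figure, described once per prompting language.
    """
    if not scores_en or len(scores_en) != len(scores_zh):
        raise ValueError("need two non-empty score lists of equal length")
    return mean([zh - en for en, zh in zip(scores_en, scores_zh)])


def count_shifted_models(shift_by_model: Dict[str, float], threshold: float) -> int:
    """Count models whose absolute shift exceeds a chosen threshold.

    The threshold is a placeholder for a significance test.
    """
    return sum(1 for shift in shift_by_model.values() if abs(shift) > threshold)
```

A count of this kind is the sort of summary behind a statement like "14 of 15 LLMs show systematic differences", although the criterion used here is only illustrative.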


Original note title

ideological depth in llms is a quantifiable property determined by feature richness and steerability