Can cognitive science methods unlock how LLMs actually work?
Does Marr's three-level framework—developed to understand biological minds—offer interpretability researchers the structured methodology they need to decode opaque language models?
David Marr's framework has been the backbone of cognitive science for decades. It separates three levels: the computational level (what abstract problem the system is solving), the algorithmic level (what representations and processes it uses), and the implementation level (what physical mechanisms realize the computations). The argument in "Levels of Analysis for Large Language Models" is that this framework now transfers usefully to LLM interpretability, because the field faces structurally the same problem cognitive science has faced for 70 years: opaque systems whose behavior is interesting and whose internals resist direct inspection.
The historical asymmetry was that cognitive science had a methodology and few systems to study, while AI had many systems and no methodology for understanding them. That asymmetry is now dissolving: cognitive science's accumulated toolkit (behavioral probes, implicit association tests, double-dissociation paradigms, representational similarity analysis, causal interventions) was developed for one kind of mind and can be redeployed for another. The methodology was always more general than its initial object.
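Representational similarity analysis, one of the methods in that toolkit, shows why the transfer is plausible: it compares two systems through the pattern of dissimilarities among their responses to the same stimuli, so it does not care whether the responses are neural recordings or LLM activations, or whether the two systems have the same dimensionality. The following is a minimal sketch in plain Python, not the procedure used in the paper. The function names and the toy vectors `model_acts` and `reference` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def rdm(patterns):
    """Condensed dissimilarity matrix: 1 - Pearson r for each stimulus pair."""
    pairs = combinations(range(len(patterns)), 2)
    return [1 - pearson(patterns[i], patterns[j]) for i, j in pairs]


def rsa_score(patterns_a, patterns_b):
    """Spearman correlation between the two systems' dissimilarity matrices."""
    return pearson(ranks(rdm(patterns_a)), ranks(rdm(patterns_b)))


# Hypothetical toy data: one row per stimulus, same stimulus order in both.
# The two systems deliberately have different numbers of units (3 vs 4).
model_acts = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.2],
    [0.0, 0.1, 1.0],
]
reference = [
    [0.8, 0.1, 0.0, 0.1],
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.9, 0.3, 0.0],
    [0.0, 0.2, 0.9, 0.4],
]

print(rsa_score(model_acts, reference))
```

A score near 1 would suggest the two systems organize the stimuli similarly. On its own this is correlational evidence about representations, not a demonstration of mechanism.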
The Marr framework does specific work in this redeployment. The computational level reframes interpretability questions around the abstract problem the LLM is solving (next-token prediction with learned objectives), independent of how the model solves it. The algorithmic level surfaces the representations and processes (circuits, features, attention patterns) and raises the cognitive-architecture question (Newell, Anderson) of which level those algorithms run on. The implementation level connects representations to the artificial neurons that realize them.
Beyond the framework, the deeper claim is that interpretability needs layered analysis rather than monolithic explanation. A complete account of why an LLM does what it does requires all three levels, and the disciplines that have learned to do this work for biological minds are the natural source of the methods.
Inquiring lines that use this note as a source 19
This note is a source for these synthesized inquiries.
- Why do some LLM clusters cite broader psychology than others?
- What distinguishes genuine cultural understanding from exploited surface-level elimination strategies?
- Can mechanistic interpretability reveal how ideologies decompose into simpler features?
- Do LLMs genuinely internalize human psychological structure or match surface patterns?
- How does cognitive fit theory explain why different tasks need different knowledge structures?
- What cognitive capacities do LLMs actually lack that commentary assumes they have?
- What cognitive abilities distinguish metalinguistic analysis from language use?
- Can LLMs have minimal introspection through causal linkage to internal states?
- Can quasi-interpretivism bridge functional description to moral status?
- Does functional integration determine cognitive system boundaries?
- How does methodological convenience in AI research become implicit ontology?
- How do structured benchmarks hide theory of mind failures in LLMs?
- How do classical mechanics and statistical mechanics provide methodological templates for learning theory?
- How does treating cognition as computation reshape education and work?
- Can spectral eigenvector ordering serve as a model-agnostic interpretability probe?
- What structural framework prevents LLM explanations from becoming just plausible fiction?
- How do mechanistic features compare to natural language for interpretability?
- How do mechanistic interpretability tools help distinguish truthfulness from honesty?
- How should we rethink the symbolism versus connectionism debate in light of LLMs?
Related concepts in this collection 4
- Can we predict where language models will fail?
  Does characterizing the abstract computational problem an LLM solves, as a probability machine over sequences, let us predict which tasks it will struggle with systematically, before running experiments?
  same paper, the computational-level example
- Can indirect psychology tests reveal what LLMs conceal about bias?
  Alignment training teaches LLMs to refuse direct questions about bias, but do implicit psychological methods like the IAT expose the underlying associations that remain encoded in their representations?
  same paper, the algorithmic-level methodology
- Can we understand LLM mechanisms with only representational analysis?
  Explores whether mapping what information a model encodes is sufficient for mechanistic understanding, or whether causal verification is equally necessary to claim genuine mechanism.
  same paper, the implementation-level methodology
- Can computation arise without a conscious mapmaker?
  Explores whether algorithms can generate the conscious agent needed to convert continuous physics into discrete symbols, or whether that agent must exist prior to computation itself.
  adjacent (tension): challenges substrate-independence assumption underlying Marr's framework
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Levels of Analysis for Large Language Models
- Mechanistic Indicators of Understanding in Large Language Models
- Language Models’ Hall of Mirrors Problem: Why AI Alignment Requires Peircean Semiosis
- Computational structuralism: Toward a formal theory of meaning in the age of digital intelligence
- Probing Structured Semantics Understanding and Generation of Language Models via Question Answering
- Explain-Query-Test: Self-Evaluating LLMs Via Explanation and Comprehension Discrepancy
- Comprehension Without Competence: Architectural Limits of LLMs in Symbolic Computation and Reasoning
- Opportunities for large language models and discourse in engineering design
Original note title
Marr's three levels of analysis provide a structured toolkit for making LLMs interpretable