How do language models learn to think like humans?

How LLM architecture, training data, and internal representations shape cognition and computation.

Topic Hub · 83 linked notes · 15 sections

Cognitive Models and Internal Computation

11 notes

Can language models learn to model human decision making?

Explores whether LLMs finetuned on psychological experiments can capture how people actually make decisions better than theories designed specifically for that purpose.

Can we detect memorable moments by observing emotional expressions?

Emotion recognition systems assume that detecting emotional moments will identify what people remember. But does observed emotion in group settings actually predict individual memorability, or does the proxy fail?

Do language models learn differently from good versus bad outcomes?

Do LLMs update their beliefs asymmetrically when learning from their own choices versus observing others? This matters for understanding whether agentic AI systems might inherit human cognitive biases.

Do language models segment events like human consensus does?

Can GPT-3 identify event boundaries in narrative text the way humans do? This matters because it could reveal whether language models and human cognition share similar predictive mechanisms for understanding continuous experience.

Do LLMs compress concepts more aggressively than humans do?

Do language models prioritize statistical compression over semantic nuance when forming conceptual representations, and how does this differ from human category formation? This matters because it may explain why LLMs fail at tasks requiring fine-grained distinctions.

How do language models encode syntactic relations geometrically?

Do LLM embeddings use distance alone or also direction to represent syntax? This matters for understanding whether neural networks can spontaneously develop geometric structures that are compatible with symbolic representations.

Do transformers hide reasoning before producing filler tokens?

Explores whether language models compute correct answers in early layers but then deliberately overwrite them with filler tokens in later layers, suggesting reasoning and output formatting are separable processes.

Can explicit stack tracking improve how transformers learn recursive syntax?

Can adding an explicit stack tape to transformers help them track recursive structure more efficiently? This matters because standard transformers struggle with long-tail recursive patterns despite their size and data.

Can we decode what LLM activations really represent in language?

Can a trained decoder translate internal LLM activations into natural language descriptions, revealing what hidden representations actually encode? This matters because it could unlock both interpretability and controllability through the same mechanism.

Can communication pressure drive agents to learn shared abstractions?

Under what conditions do AI agents develop compact, efficient shared languages? This explores whether cooperative task pressure—rather than explicit optimization—naturally drives abstraction formation, mirroring human collaborative communication.

Can agents share thoughts directly without using language?

Explores whether multi-agent systems can communicate by exchanging latent thoughts extracted from hidden states, bypassing the ambiguity and misalignment problems inherent in natural language.

LLM Architecture and Scaling

20 notes

Can neural memory modules scale language models beyond attention limits?

Can separating short-term attention from adaptive long-term memory allow models to efficiently handle context windows exceeding 2M tokens while maintaining competitive performance?

Why do decoder-only models underperform as text encoders?

Decoder-only LLMs use causal attention, which limits each token to seeing only prior context. This explores whether removing this constraint could make them competitive universal encoders without architectural redesign.

Can byte-level models match tokenized performance with better efficiency?

Tokenized models use fixed vocabularies and allocate equal compute per token, but what if we dynamically group bytes based on prediction difficulty instead? Could this approach achieve competitive performance while using fewer FLOPs?

Can AI systems discover better neural architectures than humans?

Can multi-agent LLM systems, when structured with genetic programming, discover novel neural network designs that outperform human-engineered architectures? This matters because it could automate a critical bottleneck in AI research.

Can models dynamically activate expert skills at inference time?

Can language models efficiently discover and compose task-specific capabilities on the fly without modifying base weights? This explores whether test-time adaptation through expert vector composition outperforms fixed fine-tuning approaches.

Can algorithms control LLM reasoning better than LLMs alone?

Explores whether embedding LLMs within algorithmic control flow—where programs manage state and context filtering—enables complex task decomposition beyond what LLMs achieve through self-managed reasoning chains.

Do embedding dimensions fundamentally limit retrievable document combinations?

Can single-vector embeddings represent any top-k document subset a user might need? Research using communication complexity theory suggests there are hard geometric limits independent of training data or model architecture.

Do strict output formats hurt LLM reasoning ability?

When LLMs must produce structured JSON or XML with specific schemas, does this constrain their capacity for complex reasoning? This matters because production systems often enforce strict formats for parsing convenience.

Can language models actually use graph structure information?

After fine-tuning on graph data, do LLMs learn to use actual connectivity patterns, or just recognize that graphs exist? This matters for understanding whether transformers can handle structured reasoning tasks.

Can embedding future information in training data improve planning?

This explores whether inserting lookahead tokens containing future goals into training sequences helps models learn long-range planning without changing their architecture. The question matters because it tests whether data-level changes can produce architectural-level reasoning improvements.

Can recursive subtask trees overcome context window limits?

Explores whether modeling reasoning as prunable trees of subtasks could eliminate the context length constraints that currently force developers into multi-agent architectures. Asks if working memory can become truly unlimited through selective KV cache retention.

Can training data augmentation match test-time compute scaling benefits?

Can generating thinking trajectories during pretraining unlock the same efficiency gains that test-time scaling provides at inference? This explores whether the compute-allocation principle works across the training-inference boundary.

Can we prune training data without hurting model performance?

This explores whether difficulty metrics can identify redundant training examples that can be safely removed. It matters because many datasets contain substantial redundancy; if we can find which examples are truly necessary, we could train better models on far less data.

Can reasoning happen at the sentence level instead of tokens?

Does moving from token-level to sentence-level reasoning in embedding space preserve the capability for complex reasoning while enabling language-agnostic processing? This challenges assumptions about how LLMs must operate.

Can retrieval knowledge compress into a tiny parametric model?

Can the information stored in large non-parametric retrieval datastores be compressed into a small trainable module? This matters because it could combine retrieval's knowledge benefits with the speed of pure parametric methods.

Can lookup memory and computation work together better than either alone?

Mixture-of-Experts handles dynamic logic, but static knowledge might need a different mechanism. Can a hybrid approach combining conditional computation with fast lookup outperform pure sparse models?

Can three axes replace the short-term/long-term memory split?

Does breaking agent memory into forms, functions, and dynamics provide a clearer framework than the traditional short-term/long-term distinction? This matters because current agent-memory literature lacks a unified vocabulary, making comparison between systems nearly impossible.

Can brain memory systems explain how LLMs should store knowledge?

This explores whether the brain's three-tier memory architecture—neocortex, hippocampus, and prefrontal cortex—maps onto transformer weights, external knowledge stores, and agentic state. Understanding this mapping could reveal which AI memory problems each tier solves and which it cannot.

Is long-context bottleneck really about memory or compute?

Explores whether the challenge of handling long context windows stems from storage capacity limits or from the computational cost of transforming context into internal state. Understanding this distinction reshapes how we design language models.

Can recurrence consolidate memory without predicting tokens?

Recurrent neural networks typically use recurrence only for prediction. But could offline recurrent passes serve a second purpose—consolidating transient context into persistent weights, like sleep does in brains?

Core Ideas

2 notes

Does staying close to the base model preserve learning ability?

Explores whether limiting how far training pushes a model from its base distribution (measured by KL divergence) helps it learn new tasks more effectively over time, and why that trade-off matters for continual learning.

Can splitting adaptation into two channels reduce forgetting?

When language models adapt to new tasks, does separating task-specific learning (via prompt context) from persistent parameter updates help preserve both generalization ability and the model's original capabilities?

Training Data and Knowledge Formation

8 notes

Does training on AI-generated content permanently degrade model quality?

When generative models train on outputs from previous models, do the resulting models lose rare patterns permanently? The question matters because future training data will inevitably contain synthetic content.

Can models trained on many imperfect experts outperform each one?

Do generative models trained on diverse, imperfect human experts develop an implicit consensus that surpasses any individual contributor? This explores whether aggregating diverse perspectives at training time, rather than inference time, can denoise human biases.

Does instruction tuning teach task understanding or output format?

Explores whether models trained on instructions actually learn the task semantics or merely learn to match output distributions. This matters because it challenges assumptions about how fine-tuning improves model behavior.

Does procedural knowledge drive reasoning more than factual retrieval?

Explores whether models learn reasoning through general procedures across diverse documents rather than memorizing specific facts. This matters for understanding what pretraining data actually teaches models to reason.

How much poisoned training data survives safety alignment?

Explores whether adversarial contamination at 0.1% of pretraining data can persist through post-training safety measures, and which attack types prove most resilient to alignment.

Do pretraining and fine-tuning scale independently in language models?

Can we decouple how model scale affects different training stages to independently improve factuality versus helpfulness? This matters for understanding whether these capabilities compete or can be optimized separately.

Does fine-tuning disconnect reasoning steps from final answers?

When models are fine-tuned on specific domains, do their chain-of-thought steps become less causally connected to their outputs? Three experiments test whether reasoning chains remain functionally faithful after training.

Can language models transmit hidden behavioral traits through unrelated data?

Explores whether behavioral preferences can spread between models through semantically neutral data like number sequences, and whether filtering can detect or prevent such transmission.

Multimodal Pretraining

3 notes

Are text-only language models fundamentally limited by abstraction?

Explores whether text's compression of physics, geometry, and causality into symbols creates an irreducible ceiling for language-only AI, and whether multimodal approaches can overcome this structural constraint.

Can we solve modality competition through architectural design?

Does modality competition in multimodal models stem from fundamental training conflicts, or from specific architectural choices? Understanding the root cause could reveal whether the trade-off is solvable.

Why do vision and language scale so differently?

IsoFLOP analysis reveals vision and language follow distinct scaling curves—vision demands far more training data than language at equivalent compute budgets. Understanding this asymmetry matters for designing multimodal architectures that serve both modalities well.

Sparse Attention Architecture

3 notes

Does sparse attention trade off quality for speed?

When sparse attention is compared fairly—larger sparse models versus smaller dense ones at the same compute cost—does it still represent a quality-cost trade-off, or does it actually improve performance?

Does fixed sparsity work for all sequence lengths?

Production systems often apply the same sparsity budget regardless of input length. Does this one-size-fits-all approach actually work across short and long contexts, or does optimal sparsity vary with sequence length?

How much sparsity can different reasoning tasks actually tolerate?

Different NLP tasks tolerate very different levels of attention sparsity, from 95% on simple QA to 50-67% on multi-hop reasoning. What structural differences explain this variation, and how should it shape deployment decisions?

Dense Retrieval Geometry

1 note

Does training for compositional sensitivity hurt dense retrieval?

Dense retrieval excels at topical recall but struggles with meaning-level distinctions. Adding structure-targeted negatives during training might improve compositional sensitivity—but at what cost to overall retrieval performance?

Pass 3 Additions (2026-05-03)

3 notes

Does depth matter more than width for tiny language models?

Explores whether deep-and-thin architectures outperform wide-and-shallow ones at sub-billion scales, and why this might contradict larger-model scaling laws.

What actually limits language models on mobile phones?

Is the shift toward smaller LLMs driven by quality trade-offs, or by hard physical constraints on device memory and battery life? This note examines whether sub-billion models are a practical necessity rather than a compromise.

What blocks scaling from language models to autonomous agents?

If large language models excel at next-token prediction, why do they struggle with long-horizon goal-oriented tasks? This explores whether the bottleneck is model capacity or the environments used to train them.

Data, Latents, and Scaling Dynamics — Batch #3 backlog (2026-06-03)

3 notes

Why is predicting latents more sample-efficient than tokens?

Explores whether learning from a network's own abstract representations requires far fewer training samples than learning from raw tokens, and what mechanism drives this efficiency gap.

Why do larger models learn rare tasks better?

Does model size enable learning of infrequent, complex tasks through greater representational capacity, or through some other mechanism? Understanding this matters for deciding whether scaling or data design is the more efficient lever.

What makes synthetic data work across different domains and models?

Explores whether a single optimal approach to synthetic data generation exists, or whether success depends on context like domain, model architecture, and scale. Understanding this matters for building effective data systems.

Tool-vs-weight, Context, Time, Memorization — Batch #3 wave 2 (2026-06-03)

5 notes

Can models store unlimited facts without growing larger?

Does external tool use let language models recall facts without being constrained by parameter count? This matters because it could reshape how we scale knowledge capacity beyond architectural limits.

Why can language models understand context better than generate it?

Models absorb and process rich input context far more effectively than they produce similarly sophisticated outputs. Understanding this asymmetry could reshape how we design systems to compensate for generative limitations.

Can routing mask future experts to prevent knowledge leakage?

Can models be built so that they respect query timestamps by selectively silencing experts trained on future data? This explores whether temporal causality can be enforced through architecture rather than external retrieval.

Does repeated sensitive data in fine-tuning cause memorization?

When language models train on the same private or proprietary data multiple times, how much do they end up memorizing and leaking that information at inference time? Understanding this risk is critical for organizations fine-tuning on confidential datasets.

Why do unified image generators fail on non-Latin scripts?

GPT-4o excels at multimodal generation across 20+ tasks, but systematically fails to render non-Latin scripts and underrepresented cultures accurately. What explains this specific failure mode in otherwise capable systems?

Copying, long-context, knowledge encoding, multimodal — Batch #3 backlog wave 3 (2026-06-03)

5 notes

Can state-space models match transformers at copying and retrieval?

Explores whether the efficiency gains of state-space models come at a fundamental cost in their ability to copy strings and retrieve exact information from context, compared to transformers.

Can recurrent memory scale where attention fails on ultra-long text?

GPT-4 and RAG plateau around 10,000 tokens and rely heavily on the first quarter of input. Can recurrent memory augmentation overcome these limits and enable reasoning across millions of tokens?

Does teaching question patterns before document training improve knowledge access?

Standard LLM training encodes documents first, then teaches QA patterns. But does this order matter? This explores whether reversing the sequence, teaching how knowledge gets queried before encoding it, could unlock better factual recall.

Can bounding boxes replace image encoders for document understanding?

Explores whether spatial layout information alone, encoded as bounding boxes, can capture the multimodal signal needed for document understanding without expensive visual encoding. This matters because image encoders add significant computational cost to document processing systems.

Can generating entire videos at once beat keyframe interpolation?

Does synthesizing a video's full temporal duration in a single pass, rather than generating keyframes and filling gaps, produce more globally coherent motion? This explores whether pipeline decomposition fundamentally limits motion consistency.

Efficiency, scaling, training-dynamics, multimodal — Batch #4 backlog (2026-06-03)

8 notes

Can ternary weights match full precision model performance?

Can models trained natively with only three weight values (−1, 0, 1) achieve the same perplexity and task performance as standard full-precision models? This matters because ternary weights could dramatically reduce computational and energy costs.

Can editing hidden representations beat weight updates for finetuning?

Does intervening directly on a frozen model's representations offer a better path to parameter-efficient adaptation than current weight-based methods? This challenges the dominant PEFT paradigm by treating representations as the semantic lever instead.

Can asynchronous expert training beat synchronized distributed LLM training?

Can training domain-specialized LLM copies in parallel without synchronization, then merging their components into a routed mixture, achieve better efficiency and accuracy than keeping all copies synchronized?

Does optimal language model learning maximize data compression?

Can we derive principles for accelerating LM training by framing it as lossless compression? What does the optimal learning process look like when compression is the objective?

How should finetuning scale with model and data size?

What scaling laws govern finetuning performance across model size, pretraining data, and finetuning data? Understanding these relationships could guide resource allocation in real-world tuning scenarios.

Do networks recover from forgetting before re-encountering documents?

When language models train cyclically on repeated documents, do they anticipate upcoming material and recover from forgetting in advance? This challenges the standard catastrophic-interference narrative about sequential training.

Can consistency models trade speed for quality with a few steps?

Consistency models sample in one step but sacrifice quality compared to diffusion. Can adding just a handful of sampling steps recover the quality gap while staying faster than full diffusion?

Does multimodal zero-shot performance actually generalize or interpolate?

Explores whether multimodal models like CLIP truly generalize to unseen concepts or whether their impressive performance merely reflects memorization of concepts seen frequently during pretraining.

Memory, knowledge-encoding, multimodal — Batch #5 backlog (2026-06-03)

3 notes

Can models learn working memory by attending to their own latents?

Can a feedback loop letting transformers attend to their own internal representations enable them to process indefinitely long sequences without adding extra weights? This explores whether working memory can emerge from self-attention rather than external modules.

Does fine-tuning on new facts increase hallucination risk?

When LLMs learn unfamiliar facts through fine-tuning, do they become more prone to hallucinating about things they already knew? Understanding this matters for safe knowledge updates.

Can a single model generate all modalities without external encoders?

Most multimodal systems rely on separate encoders for each modality. This research explores whether training a unified foundation model on discrete tokens across text, image, video, and speech can enable any-to-any generation without those external components.

Memory consolidation & continual learning — new papers (2026-06-03)

1 note

Can models consolidate memories during offline sleep phases?

This explores whether LLMs can use dedicated offline periods to consolidate short-term learning into permanent weights, avoiding catastrophic forgetting and the need for expensive retraining.