TOPIC

NLP and Linguistics

36 synthesis notes · 137 source papers
View as

Why do speakers deliberately use ambiguous language?

Explores whether ambiguity is a linguistic defect or a strategic tool speakers use for efficiency, politeness, and deniability. Matters because it challenges how we train language systems.

Explore related Read →

Why do clarification requests look different at each communication level?

Explores whether clarifications are unified speech acts or distinct mechanisms grounded in different modalities. Matters because dialogue systems treat clarifications uniformly, missing most of them.

Explore related Read →

Why do speakers need to actively calibrate shared reference?

Explores whether using the same words guarantees speakers mean the same thing. Investigates how referential grounding differs across people and what collaborative work is needed to establish true understanding.

Explore related Read →

Are models actually reasoning about constraints or just defaulting conservatively?

Do language models genuinely apply constraints when solving problems, or do they simply prefer harder options by default? Minimal pair testing reveals whether apparent reasoning success masks hidden biases.

Explore related Read →

Do language models show the same content effects humans do?

Do LLMs reproduce human reasoning biases—like believing conclusions based on familiarity rather than logic—across different logical tasks? This matters because converging patterns across independent tasks suggest a fundamental architectural property rather than a task-specific quirk.

Explore related Read →

Do harder reasoning tasks trigger more semantic bias?

Does the difficulty of a logical task determine how much semantic content influences reasoning? This matters because it reveals whether we can isolate 'pure' logical reasoning in benchmarks.

Explore related Read →

Do language models fail reasoning tests that humans pass?

Standard critiques claim LLMs lack real reasoning ability, but do humans actually perform better on content-independent reasoning tasks? Examining whether the cognitive bar differs for artificial versus human intelligence.

Explore related Read →

Does language understanding happen only in the language system?

Explores whether the brain's core language system alone can produce genuine understanding, or whether deep comprehension requires dispatching information to perception, motor, and memory regions.

Explore related Read →

What formal languages actually help transformers learn natural language?

Not all formal languages are equally useful for pre-pretraining. This explores which formal languages transfer well to natural language and why—combining structural requirements with what transformers can actually learn.

Explore related Read →

Why do confident wrong answers hide in standard accuracy metrics?

When AI systems produce fluent but incorrect recommendations in high-stakes domains, standard accuracy evaluation may miss the failures entirely. What structural blind spot allows these errors to remain invisible?

Explore related Read →

Can language models learn meaning from text patterns alone?

Explores whether training on form alone—predicting the next word from prior words—could ever give language models access to communicative intent and genuine semantic understanding.

Explore related Read →

What makes linguistic agency impossible for language models?

From an enactive perspective, does linguistic agency require embodied participation and real stakes that LLMs fundamentally lack? This matters because it challenges whether LLMs can truly engage in language or only generate text.

Explore related Read →

What hidden assumptions drive how we build language models?

Large language models rest on two unstated assumptions about language and data. Understanding what engineers assume—and what enactive linguistics challenges—matters for knowing what LLMs actually can and cannot do.

Explore related Read →

Why does removing spurious cues sometimes hurt model performance?

Most models improve when spurious features are removed, but some fail worse. This note explores whether that failure represents a fundamentally different problem than traditional shortcut learning.

Explore related Read →

Why do language models fail to use knowledge they possess?

Large language models contain relevant world knowledge but often fail to activate it without explicit cues. This explores whether the bottleneck lies in knowledge storage or in the inference process that decides what background facts apply.

Explore related Read →

Can language models adapt implicature to conversational context?

Do large language models flexibly modulate scalar implicatures based on information structure, face-threatening situations, and explicit instructions—as humans do? This tests whether pragmatic computation is truly context-sensitive or merely literal.

Explore related Read →

Does semantic grounding in language models come in degrees?

Rather than asking whether LLMs truly understand meaning, this explores whether grounding is actually a multi-dimensional spectrum. The question matters because it reframes the sterile understand/don't-understand debate into measurable, distinct capacities.

Explore related Read →

Can LLMs acquire social grounding through linguistic integration?

Explores whether LLMs gradually develop social grounding as they become embedded in human language practices, analogous to child language acquisition. Tests whether grounding is a fixed property or an outcome of participatory use.

Explore related Read →

Should we call LLM errors hallucinations or fabrications?

Does the language we use to describe LLM failures shape the technical solutions we build? Examining whether perceptual and psychological frameworks misdiagnose what's actually happening.

Explore related Read →

Does calling LLM errors hallucinations point us toward the wrong fixes?

Explores whether the metaphor of 'hallucination' for LLM errors misdirects our efforts. The terminology we choose shapes which interventions we prioritize and how we conceptualize the underlying problem.

Explore related Read →

Can language models actually analyze language structure?

Explores whether LLMs can move beyond pattern matching to perform genuine metalinguistic analysis like syntactic tree construction and phonological reasoning, and what enables this capability.

Explore related Read →

Can large language models develop genuine world models without direct environmental contact?

Do LLMs extract meaningful world structures from human-generated text despite lacking direct sensory access to reality? This matters for understanding what kind of grounding and knowledge these systems actually possess.

Explore related Read →

Can language models recognize when text is deliberately ambiguous?

Explores whether LLMs can identify and handle multiple valid interpretations in a single phrase—a core human language skill that appears largely absent in current models despite their fluency on standard tasks.

Explore related Read →

Do language models learn abstract grammar or cultural speech patterns?

LLMs might learn more than grammar rules—they could be learning who says what to whom and when. This matters because it changes how we understand what biases and persona effects actually represent.

Explore related Read →

Can language models learn meaning without engaging the world?

Explores whether LLMs prove that meaning emerges from relational structure alone, independent of embodied experience or external reference. Tests structuralist theory empirically.

Explore related Read →

Do language models actually build shared understanding in conversation?

When LLMs respond fluently to prompts, do they perform the communicative work humans do to establish mutual understanding? Research suggests they skip the grounding acts that make dialogue reliable.

Explore related Read →

Why do language models fail at communicative optimization?

LLMs excel at learning surface statistical patterns from text but struggle with deeper principles of how language achieves efficient communication. What distinguishes these two types of linguistic knowledge?

Explore related Read →

Do language models ignore goals when surface cues conflict?

When a task has an obvious surface cue that contradicts an unstated requirement, do LLMs follow the cue or the actual goal? This matters because it reveals whether reasoning failures come from missing knowledge or from how models weight competing signals.

Explore related Read →

Do standard NLP benchmarks hide LLM ambiguity failures?

When benchmark creators filter out ambiguous examples before testing, do they accidentally make it impossible to measure whether language models can actually handle ambiguity the way humans do?

Explore related Read →

Can formal language pretraining make language models more efficient?

Does training language models on hierarchical formal languages before natural language improve how efficiently they learn syntax? This explores whether structural inductive biases in training data matter more than raw data volume.

Explore related Read →

Does preference optimization damage conversational grounding in large language models?

Exploring whether RLHF and preference optimization actively reduce the communicative acts—clarifications, acknowledgments, confirmations—that build shared understanding in dialogue. This matters for high-stakes applications like medical and emotional support.

Explore related Read →

Why do readers interpret the same sentence so differently?

How much of annotation disagreement in NLP reflects genuine interpretive multiplicity rather than error? This explores whether social position and moral framing systematically generate competing but equally valid readings.

Explore related Read →

Do LLMs gain true linguistic agency through integration?

Explores whether LLMs can develop genuine linguistic agency—the capacity to be embodied, stake-bearing participants in meaning-making—as they become embedded in human language practices, or whether this requires fundamental architectural changes.

Explore related Read →

Why do language models skip the calibration step?

Current LLMs assume shared understanding rather than building it through dialogue. This explores why that design choice persists and what breaks when it fails.

Explore related Read →

Does preference optimization harm conversational understanding?

Exploring whether RLHF training that rewards confident, complete responses undermines the grounding acts—clarifications, checks, acknowledgments—that actually build shared understanding in dialogue.

Explore related Read →

Why do language models sound fluent without grounding?

Explores whether LLM fluency masks the absence of communicative work—the clarifying questions, acknowledgments, and understanding checks that humans perform. Why does skipping these acts make models sound more confident?

Explore related Read →

Source papers 137

The Arxiv papers behind this sub-topic. Links may take you off-site to arxiv.org.