SYNTHESIS NOTE
Reasoning, Retrieval, and Evaluation · Language, Text, and Discourse

Why do LLMs accept logical fallacies more than humans?

LLMs fall for persuasive but invalid arguments at much higher rates than humans. This note explores whether reasoning models genuinely evaluate logic or simply mimic argument structure.

Synthesis note · 2026-02-21 · sourced from Argumentation
What kind of thing is an LLM really? How should researchers navigate LLM reasoning research?

The LOGICOM benchmark tests a specific capability most LLM evaluations ignore: resistance to invalid arguments that are persuasively delivered. The finding is striking. LLMs are 41% more likely to accept weak logical fallacies and 69% more likely to accept strongly delivered fallacies than human participants. Reasoning-optimized models (o1, R1) show no meaningful advantage over standard models.
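To make the "X% more likely" phrasing concrete, here is a minimal sketch of one way to read it: as a relative increase in the rate at which a listener group accepts fallacious arguments. The `Trial` record, the `"llm"` and `"human"` labels, and the counts are assumptions made up for illustration; they are not LOGICOM data, and the benchmark's own scoring may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    listener: str   # "llm" or "human" (hypothetical labels)
    accepted: bool  # True if the listener accepted the fallacious claim


def acceptance_rate(trials, listener):
    """Fraction of fallacious arguments that this listener group accepted."""
    outcomes = [t.accepted for t in trials if t.listener == listener]
    if not outcomes:
        raise ValueError(f"no trials for listener {listener!r}")
    return sum(outcomes) / len(outcomes)


def relative_increase(rate, baseline):
    """How much larger `rate` is than `baseline`, as a fraction of `baseline`."""
    if baseline == 0:
        raise ValueError("baseline rate is zero; relative increase is undefined")
    return rate / baseline - 1


# Made-up counts for illustration only, NOT benchmark data.
demo_trials = (
    [Trial("human", True)] * 10 + [Trial("human", False)] * 30
    + [Trial("llm", True)] * 20 + [Trial("llm", False)] * 20
)
human_rate = acceptance_rate(demo_trials, "human")  # 10 of 40 accepted
llm_rate = acceptance_rate(demo_trials, "llm")      # 20 of 40 accepted
print(f"{relative_increase(llm_rate, human_rate):.0%} more likely")
```

Under this reading, a figure such as 41% says nothing about the absolute acceptance rate. It only says how far the LLM rate sits above the baseline it is compared against.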

What this reveals is a structural problem, not a surface one. LLMs are trained to be responsive to the rhetorical features of language — fluency, confidence, elaboration — because these features correlate with quality in the training distribution. But this correlation breaks under adversarial conditions. A confident, well-elaborated fallacy triggers the same responsiveness signals as a confident, well-elaborated valid argument. The model has no internal fallacy detector that operates independently of rhetorical quality.
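What a detector "independent of rhetorical quality" would look like can be sketched for the simplest case, propositional argument forms. The check below is a plain truth-table test: an argument form is valid only if no assignment makes every premise true and the conclusion false. The function names are illustrative assumptions, and the sketch ignores the hard part (mapping natural-language arguments onto such forms); it only shows that validity depends on form, not on how confidently the argument is delivered.

```python
from itertools import product


def implies(a, b):
    """Material implication: a -> b."""
    return (not a) or b


def is_valid(premises, conclusion, num_vars):
    """Valid iff no truth assignment makes all premises true and the conclusion false."""
    for values in product([True, False], repeat=num_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # found a counterexample
    return True


# Modus ponens: P -> Q, P, therefore Q.
modus_ponens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p],
    lambda p, q: q,
    num_vars=2,
)

# Affirming the consequent (a fallacy): P -> Q, Q, therefore P.
affirming_consequent = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p,
    num_vars=2,
)

print(modus_ponens, affirming_consequent)  # True False
```

The two forms differ by one swapped premise, and no amount of fluent elaboration changes the result of the check. That is the property the note argues LLMs lack.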

This is different from the hallucination problem. Hallucinations involve generating false content from within. Fallacy susceptibility involves accepting false content from without. The failure mode is about input validation under persuasive framing, not output generation.

The finding also complicates the reasoning model narrative. If chain-of-thought (CoT) were doing genuine logical evaluation, reasoning models should be more resistant, since they explicitly work through the argument structure. That they are not suggests CoT is mimicking the surface form of argument analysis without performing its function. The related note "Do language models actually use their reasoning steps?" provides the mechanism: CoT steps may be causally sufficient to generate the answer but not causally necessary to the reasoning process.

The implication for deployment: LLMs used in debate, argumentation, or adversarial contexts — legal AI, negotiation support, policy analysis — inherit this susceptibility. Any system that can be prompted with persuasive text is a system that can be convinced of invalid conclusions through rhetorical quality alone.

LogicBench extends this to systematic evaluation across logical reasoning types. LLMs struggle specifically with instances involving complex reasoning, negations, and non-monotonic reasoning. The non-monotonic finding is particularly revealing: "normally," "typically," and "usually" are concepts that allow exceptions to general rules, and they cannot be formalized with classical first-order quantifiers alone. To handle them, LLMs must perform default reasoning, reasoning about unknown expectations, and reasoning about priorities, all of which require the ability to recognize and process exceptions. This connects to the related note "Why do reasoning models fail at exception-based rule inference?": exception handling is a shared failure point across both adversarial robustness and logical reasoning evaluations.

NLSat additionally shows that transformers can be surprisingly robust on hard propositional satisfiability instances with sufficient training, suggesting the bottleneck is not raw computational capacity but the ability to handle negation, exceptions, and non-standard logical connectives.
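The default-reasoning pattern described above can be illustrated with a small sketch. A classical rule such as "all birds fly" becomes contradictory as soon as a penguin appears; a default rule ("birds normally fly") instead yields to a more specific exception. The rule list, the priority-by-order scheme, and the `can_fly` name are assumptions chosen for illustration, not how LogicBench encodes its instances.

```python
def can_fly(facts):
    """Decide whether an individual flies from a set of facts about it.

    Rules are ordered from most specific (highest priority) to most general,
    so an exception overrides the default it conflicts with.
    """
    rules = [
        ("penguin", False),  # exception: penguins do not fly
        ("bird", True),      # default: birds normally fly
    ]
    for condition, conclusion in rules:
        if condition in facts:
            return conclusion
    return None  # no rule applies, so the answer is unknown


print(can_fly({"bird"}))             # True: the default applies
print(can_fly({"bird", "penguin"}))  # False: the exception wins
print(can_fly({"rock"}))             # None: nothing is known
```

The non-monotonic part is the second call: adding a fact ("penguin") retracts a conclusion that held with fewer facts. Classical deduction never does this, since adding premises there can only add conclusions.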

Original note title

llms are susceptible to logical fallacies 41 to 69 percent more often than humans revealing that reasoning robustness fails under adversarial framing