SYNTHESIS NOTE

How do logic units preserve procedural coherence better than chunks?

Can structured retrieval units with prerequisites, headers, bodies, and linkers maintain step-by-step coherence in how-to answers where fixed-size chunks fail? This matters because procedural questions require sequential logic and conditional branching that chunk-based RAG cannot support.

Synthesis note · 2026-02-22 · sourced from Question Answer Search

RAG systems overwhelmingly use fixed-size chunks as their retrieval granularity. This works acceptably for factoid "5W" questions (who, what, where, when, why) where the answer is localized. It fails systematically for "1H" questions — how-to questions — which require sequential, procedurally coherent answers where step ordering, prerequisites, and conditional branching matter.

THREAD proposes logic units (LUs) as an alternative retrieval granularity with four components:

Prerequisite: information needed to understand the LU, such as domain terminology, abbreviations, and constraints that must be met. It functions both as a context supplement (preventing hallucination from decontextualized chunks) and as a filter (excluding irrelevant LUs whose constraints are unmet).
Header: a summary or intent description, used for indexing. Unlike chunks, which are indexed on their entire content, headers enable intent-based retrieval: queries are matched to the purpose of the LU rather than its surface content.
Body: the detailed content, such as specific actions, code blocks, and instructions. This is the core material fed to the LLM generator.
Linker: a bridge to subsequent logic units. It lists the possible outcomes of taking an action and points each one to the next-step LU to retrieve. This is the critical innovation: it enables dynamic, multi-step answer construction where each step's outcome determines the next retrieval.
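
The four components can be pictured as a small record type. The following is a minimal sketch, not THREAD's actual schema: the field types, the `index_text` helper, and the example content are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LogicUnit:
    """One retrieval unit with the four components listed above (assumed layout)."""
    header: str   # intent summary; the part that gets indexed
    body: str     # actions, code, instructions passed to the generator
    prerequisite: dict[str, str] = field(default_factory=dict)  # terms and constraints
    linker: dict[str, str] = field(default_factory=dict)        # outcome -> next LU id


def index_text(lu: LogicUnit) -> str:
    """Text handed to the index: the header only, never the body."""
    return lu.header


# Hypothetical example content, invented for illustration.
restart = LogicUnit(
    header="Restart the web service",
    body="Run the service restart command, then check the status output.",
    prerequisite={"access": "admin shell on the host"},
    linker={"service is running": "verify-traffic", "service failed": "inspect-logs"},
)
```

Keeping the linker as a mapping from outcome to next-unit id is one possible encoding; it makes the branching explicit in the data rather than leaving it to a follow-up similarity search.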

The linker is what makes THREAD fundamentally different from chunk-based RAG. Chunks have no mechanism for specifying what should come next — retrieval of subsequent chunks relies on the same query or the generated partial answer, both of which degrade as the procedure progresses. Linkers provide explicit navigation between steps, enabling branching paths (if server load is high → do X; if normal → do Y).
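
The navigation described above can be sketched as a walk over linkers. This is an illustrative sketch under stated assumptions: the `UNITS` store, the unit ids, and the `follow_linkers` function are hypothetical, and the observed outcomes are passed in as a list, whereas a real system would presumably obtain them from the user or the generator at each step.

```python
from dataclasses import dataclass, field


@dataclass
class LogicUnit:
    body: str
    linker: dict[str, str] = field(default_factory=dict)  # outcome -> id of next LU


# Hypothetical LU store keyed by id; contents are invented for illustration.
UNITS = {
    "check-load": LogicUnit(
        body="Check the current server load.",
        linker={"high": "scale-out", "normal": "deploy"},
    ),
    "scale-out": LogicUnit(
        body="Add capacity before continuing.",
        linker={"done": "deploy"},
    ),
    "deploy": LogicUnit(body="Deploy the new release."),
}


def follow_linkers(start_id: str, outcomes: list[str], max_steps: int = 10) -> list[str]:
    """Collect step bodies by following the linker chosen by each observed outcome."""
    steps: list[str] = []
    current = start_id
    remaining = list(outcomes)
    for _ in range(max_steps):
        unit = UNITS[current]
        steps.append(unit.body)
        if not unit.linker or not remaining:
            break  # end of the procedure, or no further outcome has been observed
        next_id = unit.linker.get(remaining.pop(0))
        if next_id is None:
            break  # outcome not covered by any linker: stop rather than guess
        current = next_id
    return steps
```

Calling `follow_linkers("check-load", ["high", "done"])` walks the high-load branch through the scale-out step, while `follow_linkers("check-load", ["normal"])` goes straight to deployment. Neither path depends on re-querying with the original question.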

This connects to a broader RAG failure mode. As the related note "Do vector embeddings actually measure task relevance?" argues, embeddings rank semantic similarity rather than task relevance, so the chunk+embedding approach fails for procedural questions doubly: embeddings can't capture sequential dependency, and chunks can't preserve it. Logic units address both by structuring retrieval around intent (header) and navigation (linker) rather than semantic similarity.
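
Intent-based retrieval with prerequisite filtering can be sketched as follows. This is a toy illustration, not THREAD's retriever: the `retrieve` and `overlap` functions and the example units are hypothetical, prerequisites are simplified to exact key/value constraints, and word overlap stands in for whatever similarity model a real system would use. The point it shows is only that ranking looks at the header, never the body.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LogicUnit:
    header: str
    body: str
    prerequisite: dict[str, str] = field(default_factory=dict)  # constraint -> required value


def overlap(a: str, b: str) -> float:
    """Jaccard word overlap; a toy stand-in for a real similarity model."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def retrieve(query: str, context: dict[str, str],
             units: list[LogicUnit]) -> Optional[LogicUnit]:
    """Drop units whose prerequisites are unmet, then rank the rest by header match."""
    eligible = [
        u for u in units
        if all(context.get(k) == v for k, v in u.prerequisite.items())
    ]
    return max(eligible, key=lambda u: overlap(query, u.header), default=None)


# Hypothetical units, invented for illustration.
units = [
    LogicUnit("install the driver", "Use the package manager.", {"os": "linux"}),
    LogicUnit("install the driver", "Run the installer.", {"os": "windows"}),
    LogicUnit("remove the driver", "Undo the install: the install script left files."),
]
best = retrieve("how to install the driver", {"os": "windows"}, units)
```

Here the Linux unit is excluded by its unmet constraint, and the removal unit loses on header match even though its body mentions installing more often than any other unit.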

Related concepts in this collection 4

Do vector embeddings actually measure task relevance?
Vector embeddings rank semantic similarity, but RAG systems need topical relevance. When these diverge (as with king/queen versus king/ruler), does similarity-based retrieval fail in production?
Logic units address the task-relevance gap by indexing on intent (headers) rather than semantic similarity.

What do enterprise RAG systems need beyond accuracy?
Academic RAG benchmarks focus on question-answering accuracy, but enterprise deployments in regulated industries face five distinct requirements (compliance, security, scalability, integration, and domain expertise) that standard architectures don't address.
Logic units address the coherence and reliability requirements that enterprise RAG needs.

When do graph databases outperform vector embeddings for retrieval?
Vector similarity struggles with aggregate and relational queries that require traversing multiple entity connections. Can graph-oriented databases with deterministic queries solve this failure mode in enterprise domain applications?
Linkers in logic units implement a lightweight form of relational traversal within the document structure.

Does question type determine the right retrieval strategy?
Explores whether different non-factoid question (NFQ) types require distinct retrieval and decomposition approaches. Matters because standard RAG fails when applied uniformly to debate, comparison, and experience questions despite being effective for factoid queries.
How-to questions are a specific NFQ type requiring procedural coherence that logic units provide.
