SYNTHESIS NOTE

Can tailoring queries per document improve debatable summarization?

When summarizing documents with opposing perspectives on a topic, does adapting the query to each document's unique content retrieve more balanced viewpoints than using a single uniform query?

Synthesis note · 2026-02-23 · sourced from Agents Multi

When a query has opposing but equally valid perspectives across documents ("Is law school worth it?"), standard summarization fails in two specific ways. First, using the same query to retrieve contexts from every document misses document-specific perspectives — a query about "career outcomes" may not retrieve a document's strongest arguments about "personal growth." Second, merging free-form intermediate outputs requires extra reasoning to extract, classify, and compare perspectives, distracting from balanced summary generation.

MODS (Moderating a Mixture of Document Speakers) applies a panel discussion metaphor. Each document gets its own Speaker LLM that responds to tailored queries using only its document's content. A Moderator LLM plans an agenda of topics, selects relevant speakers per topic, and tailors a specific query to each selected speaker. Speakers retrieve their document's context relevant to the tailored query and report both "yes" and "no" perspectives. The moderator tracks all perspectives in a structured outline, which guides the final summary.
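The moderator and speaker loop described above can be sketched in a few lines. This is a toy illustration, not the MODS implementation: the class names, the `tailor` and `respond` methods, the keyword-overlap retrieval, and the negation-based stance labelling are all assumptions standing in for LLM calls, and the agenda is hand-written rather than planned by a model.

```python
import re
from dataclasses import dataclass, field


def words(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for real retrieval features."""
    return set(re.findall(r"[a-z]+", text.lower()))


@dataclass
class Speaker:
    """Hypothetical stand-in for a Speaker LLM that sees only its own document."""
    name: str
    sentences: list[str]

    def respond(self, query: str) -> dict[str, list[str]]:
        # Toy retrieval: keep sentences sharing at least one word with the query.
        hits = [s for s in self.sentences if words(query) & words(s)]
        # Toy stance labelling (assumption): sentences containing "not" count as "no".
        answer: dict[str, list[str]] = {"yes": [], "no": []}
        for s in hits:
            answer["no" if "not" in words(s) else "yes"].append(s)
        return answer


@dataclass
class Moderator:
    """Hypothetical stand-in for the Moderator LLM."""
    speakers: list[Speaker]
    outline: dict[str, dict[str, dict[str, list[str]]]] = field(default_factory=dict)

    def tailor(self, topic_terms: set[str], speaker: Speaker) -> str:
        # Tailoring (assumption): keep only topic terms this document discusses.
        vocab = set().union(*(words(s) for s in speaker.sentences))
        return " ".join(sorted(topic_terms & vocab))

    def run(self, agenda: dict[str, set[str]]) -> dict:
        for topic, terms in agenda.items():
            for speaker in self.speakers:
                query = self.tailor(terms, speaker)
                if not query:  # nothing relevant: speaker is not selected
                    continue
                # Record both stances per speaker in the structured outline.
                self.outline.setdefault(topic, {})[speaker.name] = speaker.respond(query)
        return self.outline


doc_a = Speaker("doc_a", ["Graduates earn a high salary.", "Debt is not easy to repay."])
doc_b = Speaker("doc_b", ["Law school drives personal growth.", "The workload is not healthy."])
agenda = {"worth": {"salary", "debt", "growth", "workload"}}  # hand-written agenda

outline = Moderator([doc_a, doc_b]).run(agenda)
uniform = doc_b.respond("salary debt")  # one shared query retrieves nothing from doc_b
```

In this toy setup the uniform query "salary debt" returns no sentences from `doc_b`, while the tailored query surfaces both its "yes" and "no" perspectives, which is the retrieval gap the note attributes to uniform querying.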

The results are substantial: 38-58% improvement in topic paragraph coverage and balance over baselines. The mechanism is the tailored query — by asking each document-speaker a question aligned to that document's unique expertise, MODS retrieves perspectives that a uniform query would miss. This is a retrieval problem disguised as a summarization problem.

The design insight generalizes beyond debatable summarization. Any task where multiple sources have different relevant expertise benefits from source-specific querying rather than uniform querying. MODS extends the principle discussed in the related note "Do hierarchical retrieval architectures outperform flat ones on complex queries?": it not only separates planning from synthesis but also specializes the query per source.

The connection to the related note "Does including all conversation history actually help retrieval?" is direct: MODS solves at the document level the same problem that selective history solves at the conversation level. Irrelevant context degrades retrieval, and the fix is source-aware filtering.

Complementary approach — reranking-based perspective summarization: Where MODS specializes at the retrieval stage (tailored queries per document), reranking-based methods operate at the generation stage: generate multiple candidate summaries, then rerank for coverage and faithfulness. Reranking consistently outperforms prompting frameworks for perspective summarization — even when prompting is scaled to high-resource settings. DPO on reranked self-generated summaries further boosts both attributes, with the most pronounced gains in faithfulness. Additionally, LM-based evaluation metrics (AlignScore, prompting-based scoring) substantially outperform traditional metrics (ROUGE, BERTScore) for measuring perspective summary quality. MODS and reranking address different bottlenecks: MODS ensures diverse perspectives are retrieved, reranking ensures the generated summary faithfully represents them.
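The generate-then-rerank step can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: summaries are modelled as lists of claim strings, `coverage` and `faithfulness` are exact-match toy scores rather than the LM-based metrics mentioned above, and combining them by product is one arbitrary choice among many.

```python
def coverage(summary: list[str], perspectives: list[str]) -> float:
    """Fraction of source perspectives that the summary mentions (toy score)."""
    if not perspectives:
        return 0.0
    return sum(p in summary for p in perspectives) / len(perspectives)


def faithfulness(summary: list[str], perspectives: list[str]) -> float:
    """Fraction of summary claims backed by a source perspective (toy score)."""
    if not summary:
        return 0.0
    return sum(c in perspectives for c in summary) / len(summary)


def rerank(candidates: list[list[str]], perspectives: list[str]) -> list[list[str]]:
    """Sort candidate summaries best-first by coverage times faithfulness."""
    return sorted(
        candidates,
        key=lambda c: coverage(c, perspectives) * faithfulness(c, perspectives),
        reverse=True,
    )


perspectives = ["salary is high", "debt is heavy", "growth is real"]
candidates = [
    ["salary is high"],                                          # faithful, low coverage
    ["salary is high", "debt is heavy", "jobs are guaranteed"],  # one unsupported claim
    ["salary is high", "debt is heavy", "growth is real"],       # faithful and complete
]

ranked = rerank(candidates, perspectives)
best = ranked[0]

# Assumption: one simple way to build a preference pair for DPO-style training
# is (top-ranked, bottom-ranked); the note does not say how pairs are formed.
preference_pair = (ranked[0], ranked[-1])
```

The product penalizes a candidate that scores well on only one attribute, so a summary that covers every perspective but adds unsupported claims does not outrank one that is both complete and faithful.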

Original note title

debatable query-focused summarization requires per-document speaker specialization with moderator orchestration