TOPIC

Design Frameworks

18 synthesis notes · 61 source papers
Do formal language prototypes improve reasoning across different domains?

Can training language models on abstract reasoning patterns in Prolog and PDDL help them generalize to new reasoning tasks? This tests whether shared logical structures underlie seemingly different problem domains.

Where does agent reliability actually come from?

Explores whether LLM agent reliability comes from larger models or from deliberate system design choices, such as memory, skills, and protocols, that shift cognitive work outside the model.

Why do AI agents miss most of what users actually want?

UserBench explores why current models align with user intent only 20% of the time, even when users reveal preferences across multiple turns. The question examines whether agents can learn to actively clarify ambiguous or evolving goals.

How should chatbot design vary by relationship duration?

Do chatbots serving one-time users need a different design from those supporting long-term relationships? This matters because applying the same design to all temporal profiles creates usability mismatches.

Should AI systems stay collaborative rather than fully autonomous?

Explores whether keeping humans in the loop with AI agents is more reliable than pursuing full autonomy. Investigates whether collaboration solves problems that autonomous systems structurally cannot.

Can designers shape LLM behavior without deep technical knowledge?

Explores whether LLMs can be treated as adaptable design materials that designers can tinker with directly, rather than fixed components handed over by engineers. Matters because it determines whether user-centered judgment reaches model adaptation early.

Do firms substitute AI for labor at different rates?

Explores whether companies exposed to AI shocks replace contracted workers with AI tools uniformly or at varying rates, and what firm-level differences reveal about the economics of AI adoption.

Do generated interfaces outperform text-based chat for most tasks?

Explores whether LLMs should create interactive UIs instead of text responses, and under what conditions users prefer dynamic interfaces to traditional conversational chat.

How should users control systems with unpredictable outputs?

When generative AI produces different outputs from identical inputs, how do interaction design principles help users maintain control and develop effective mental models for stochastic systems?

How do communication modalities shape human-agent collaboration patterns?

Does varying how humans and agents exchange information—text, voice, or structured channels—produce measurably different negotiation, trust, and awareness outcomes in collaborative tasks?

When should human-agent systems ask for human help?

Explores the timing problem in collaborative AI systems: since there's no objective metric for optimal interruption, how can we design deferral mechanisms that know when to involve humans without constant disruption or silent failures?

Why do people share more openly with machines than humans?

Does the absence of social goals in human-machine communication explain why people disclose sensitive information more readily to chatbots? Understanding this mechanism could reshape how we design conversational AI.

Do humans apply human-human scripts to AI interactions?

Does CASA theory correctly explain how people interact with media agents, or have decades of technology use created separate interaction scripts? Understanding which scripts drive behavior matters for AI design.

Can language models discover what users actually want from activity logs?

Users pursue month-long interest journeys that transcend individual item clicks. Can LLMs extract these persistent goals from behavioral patterns, and does this change how we should think about personalization?

Why do LLMs excel at feasible design but struggle with novelty?

When LLMs generate conceptual product designs, they produce more implementable and useful solutions than humans but fewer novel ones. This explores why domain constraints flip the novelty advantage seen in research ideation.

Do more social cues always make AI feel more present?

Explores whether quantity of social cues matters as much as their quality in triggering social responses to AI. Tests whether multiple weak cues can substitute for one strong one.

Can user embeddings personalize language models more efficiently than prompts?

Does distilling user interaction history into learned embeddings outperform stuffing that history directly into prompts for personalizing large language models? This matters because interaction data is long and expensive to process as tokens.

Can AI systems preserve moral value conflicts instead of averaging them?

Current AI systems wash out value tensions through majority aggregation. Can we instead model how values like honesty and friendship genuinely conflict in moral reasoning?

Source papers 61

The arXiv papers behind this sub-topic. Links may take you off-site to arxiv.org.