TOPIC

Design Frameworks

18 synthesis notes · 61 source papers

Do formal language prototypes improve reasoning across different domains?

Can training language models on abstract reasoning patterns in Prolog and PDDL help them generalize to new reasoning tasks? This tests whether shared logical structures underlie seemingly different problem domains.

Where does agent reliability actually come from?

Exploring whether LLM agent performance depends on larger models or on thoughtful system design choices like memory, skills, and protocols that shift cognitive work outside the model.

Why do AI agents miss most of what users actually want?

UserBench explores why current models align with user intent only 20% of the time, even when users reveal preferences across multiple turns. The question examines whether agents can learn to actively clarify ambiguous or evolving goals.

How should chatbot design vary by relationship duration?

Do chatbots serving one-time users need different design than those supporting long-term relationships? This matters because applying the same design to all temporal profiles creates usability mismatches.

Should AI systems stay collaborative rather than fully autonomous?

Explores whether keeping humans in the loop with AI agents is more reliable than pursuing full autonomy. Investigates whether collaboration solves problems that autonomous systems structurally cannot.

Can designers shape LLM behavior without deep technical knowledge?

Explores whether LLMs can be treated as adaptable design materials that designers can tinker with directly, rather than fixed components handed over by engineers. Matters because it determines whether user-centered judgment reaches model adaptation early.

Do firms substitute AI for labor at different rates?

Explores whether companies exposed to AI shocks replace contracted workers with AI tools uniformly or at varying rates, and what firm-level differences reveal about the economics of AI adoption.

Do generated interfaces outperform text-based chat for most tasks?

Explores whether LLMs should create interactive UIs instead of text responses, and under what conditions users prefer dynamic interfaces to traditional conversational chat.

How should users control systems with unpredictable outputs?

When generative AI produces different outputs from identical inputs, how do interaction design principles help users maintain control and develop effective mental models for stochastic systems?

How do communication modalities shape human-agent collaboration patterns?

Does varying how humans and agents exchange information—text, voice, or structured channels—produce measurably different negotiation, trust, and awareness outcomes in collaborative tasks?

When should human-agent systems ask for human help?

Explores the timing problem in collaborative AI systems: since there's no objective metric for optimal interruption, how can we design deferral mechanisms that know when to involve humans without constant disruption or silent failures?

Why do people share more openly with machines than humans?

Does the absence of social goals in human-machine communication explain why people disclose sensitive information more readily to chatbots? Understanding this mechanism could reshape how we design conversational AI.

Do humans apply human-human scripts to AI interactions?

Does CASA theory correctly explain how people interact with media agents, or have decades of technology use created separate interaction scripts? Understanding which scripts drive behavior matters for AI design.

Can language models discover what users actually want from activity logs?

Users pursue month-long interest journeys that transcend individual item clicks. Can LLMs extract these persistent goals from behavioral patterns, and does this change how we should think about personalization?

Why do LLMs excel at feasible design but struggle with novelty?

When LLMs generate conceptual product designs, they produce more implementable and useful solutions than humans but fewer novel ones. This explores why domain constraints flip the novelty advantage seen in research ideation.

Do more social cues always make AI feel more present?

Explores whether quantity of social cues matters as much as their quality in triggering social responses to AI. Tests whether multiple weak cues can substitute for one strong one.

Can user embeddings personalize language models more efficiently than prompts?

Does distilling user interaction history into learned embeddings outperform stuffing that history directly into prompts for personalizing large language models? This matters because interaction data is long and expensive to process as tokens.

Can AI systems preserve moral value conflicts instead of averaging them?

Current AI systems wash out value tensions through majority aggregation. Can we instead model how values like honesty and friendship genuinely conflict in moral reasoning?

Source papers 61

The arXiv papers behind this sub-topic. Links may take you off-site to arxiv.org.

A Call for Collaborative Intelligence: Why Human-Agent Systems Should Precede AI Autonomy
Recent improvements in large language models (LLMs) have led many researchers to focus on building fully autonomous AI agents. This position paper questions whether this approach is the right path for…
Agent S: An Open Agentic Framework that Uses Computers Like a Human
We present Agent S, an open agentic framework that enables autonomous interaction with computers through a Graphical User Interface (GUI), aimed at transforming human-computer interaction by automatin…
An extended framework for characterizing social robots
Before outlining the content of our framework, it is useful to first look at existing frameworks for classifying social robots. In part…
Bridging the gulf of envisioning: Cognitive design challenges in LLM interfaces
Large language models (LLMs) exhibit dynamic capabilities and appear to comprehend complex and ambiguous natural language prompts. However, calibrating LLM interactions is challenging for interface de…
Building Machines that Learn and Think with People
What do we want from machine intelligence? We envision machines that are not just tools for thought, but partners in thought: reasonable, insightful, knowledgeable, reliable, and trustworthy systems t…
Building a Stronger CASA: Extending the Computers Are Social Actors Paradigm
The computers are social actors framework (CASA), derived from the media equation, explains how people communicate with media and machines demonstrating social potential. Many studies have challenged …
Canvil: Designerly Adaptation for LLM-Powered User Experiences
Advancements in large language models (LLMs) are poised to spark a proliferation of LLM-powered user experiences. In product teams, designers are often tasked with crafting user experiences that align…
ChatGPT Reads Your Tone and Responds Accordingly -- Until It Does Not -- Emotional Framing Induces Bias in LLM Outputs
Background: Large Language Models (LLMs) like GPT-4 tailor their responses not just to the content but also to the tone of user prompts. Prior work has hinted that emotional phrasing – whether optimis…
Conceptual Design Generation Using Large Language Models
Concept generation is a creative step in the conceptual design phase, where designers often turn to brainstorming, mindmapping, or crowdsourcing design ideas to complement their own knowledge…
Considering the Context to Build Theory in HCI, HRI, and HMC: Explicating Differences in Processes of Communication and Socialization With Social Technologies
our research can be outpaced by developments in the modern technological landscape. To address this issue, we often focus our inquiries conceptually rather than technically through an affordance-based…
Conversational DNA: A New Visual Language for Understanding Dialogue Structure in Human and AI
What if the patterns hidden within dialogue reveal more about communication than the words themselves? We introduce Conversational DNA, a novel visual language that treats any dialogue – whether betwe…
Conversational Prompt Engineering
Conversational Prompt Engineering (CPE), a user-friendly tool that helps users create personalized prompts for their specific tasks. CPE uses a chat model to briefly interact with users, helping them …
Design Principles for Generative AI Applications
Generative AI applications present unique design challenges. As generative AI technologies are increasingly being incorporated into mainstream applications, there is an urgent need for guidance on how…
Disambiguating Anthropomorphism and Anthropomimesis in Human-Robot Interaction
Henry Shevlin. In this preliminary work, we offer an initial disambiguation of the theoretical concepts anthropomorphism and anthropomimesis in…
Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization
strategic team of agents communicating in a dynamic interaction architecture based on the task query. Specifically, we build a framework named Dynamic LLM-Agent Network (DyLAN) for LLM-agent collabora…
Enhancing Pipeline-Based Conversational Agents with Large Language Model
This paper proposes a hybrid approach that leverages LLMs, in particular GPT-4, to enhance pipeline-based CAs. Using this approach, maintainers of existing CAs can adopt new domains and overcome the …
Expedient Assistance and Consequential Misunderstanding: Envisioning an Operationalized Mutual Theory of Mind
Design fictions allow us to prototype the future. They enable us to interrogate emerging or non-existent technologies and examine their implications. We present three design fictions that probe the po…
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
Large language model (LLM) agents are increasingly built less by changing model weights than by reorganizing the runtime around them. Capabilities that earlier systems expected the model to recover in…
Flows: Building Blocks of Reasoning and Collaborating AI
Recent advances in artificial intelligence (AI) have produced highly capable and controllable systems. This creates unprecedented opportunities for structured reasoning as well as collaboration among …
Foundations of Large Language Models
The main part of BERT models is a multi-layer Transformer network. A Transformer layer consists of a self-attention sub-layer and an FFN sub-layer. Both of them follow the post-norm architecture: outp…
From speaking like a person to being personal: The effects of personalized, regular interactions with conversational agents
human-AI interactions (Sundar, 2020). Interactions with these agents may have a shorter-term, transactional nature, for example checking the status of an order with a customer service chatbot, or a lo…
Generative Agent Simulations of 1,000 People
We present a novel agent architecture that simulates the attitudes and behaviors of 1,052 real individuals—applying large language models to qualitative interviews about their lives, then measuring ho…
Generative Interfaces for Language Models
Large language models (LLMs) are increasingly seen as assistants, copilots, and consultants, capable of supporting a wide range of tasks through natural conversation. However, most systems remain cons…
How well can large language models explain business processes?
One such system’s functionality is Situation-Aware eXplainability (SAX), which relates to generating causally sound and yet human-interpretable explanations that take into account the process context …
Interactive Evaluation Requires a Design Science
AI evaluation is undergoing a structural change. Large language models (LLMs) are increasingly deployed as systems that act over time through tools, environments, users, and other agents, yet many eva…
Large Language Models for User Interest Journeys
Large language models (LLMs) have shown impressive capabilities in natural language understanding and generation. Their potential for deeper user understanding and improved personalized user experien…
Learning Human-Object Interaction as Groups
Human-Object Interaction Detection (HOI-DET) aims to localize human-object pairs and identify their interactive relationships. To aggregate contextual cues, existing methods typically propagate inform…
Machine ex machina: A Framework Decentering the Human in AI Design Praxis
we propose a framework for decentering the human in AI design. The theoretical principles of HMC and the work of feminist STS scholars are influenced by Bruno Latour’s “actor-network theory” or ANT. …
Magentic-UI: Towards Human-in-the-loop Agentic Systems
AI agents powered by large language models are increasingly capable of autonomously completing complex, multi-step tasks using external tools. Yet, they still fall short of human-level performance in m…
Opportunities for large language models and discourse in engineering design
In this paper, we argue that foundation models such as LLMs can be used for creative reasoning tasks in the engineering design process, complementing and integrating existing computational methods suc…
Payrolls to Prompts: Firm-Level Evidence on the Substitution of Labor for AI
AI is profoundly reshaping the nature of work. Every day, stories abound about startups disrupting industries, new models being released, and the anxiety of legacy companies getting left…
Personalization of Large Language Models: A Survey
Personalization of Large Language Models (LLMs) has recently become increasingly important with a wide range of applications. Despite the importance and recent progress, most existing works on persona…
PosterMate: Audience-driven Collaborative Persona Agents for Poster Design
PosterMate gathers feedback from each persona agent regarding poster components, and stimulates discussion with the help of a moderator to reach a conclusion. These agreed-upon edits can then be direc…
Proactive behavior in voice assistants: A systematic review and conceptual model
Yet, there is a lack of review studies synthesizing the current knowledge on how proactive behavior has been implemented in VAs and under what conditions proactivity has been found more or less suitab…
ProtoReasoning: Prototypes as the Foundation for Generalizable Reasoning in LLMs
We hypothesize that cross-domain generalization arises from shared abstract reasoning prototypes — fundamental reasoning patterns that capture the essence of problems across domains. These prototypes …
Psychological, Relational, and Emotional Effects of Self-Disclosure After Conversations With a Chatbot
identity of a conversation partner, as a human or computer, matters. Previous work has found that the mere perceived identity of the partner as computer or human has profound effects, even when actual…
Reflections and New Directions for Human-Centered Large Language Models
Large Language Models (LLMs) are increasingly shaping the private and professional lives of users, with numerous applications in business, education, finance, healthcare, law, and science. With this r…
Rhetorical XAI: Explaining AI’s Benefits as well as its Use via Rhetorical Design
Modern AI systems are notoriously opaque, limiting efforts to understand or audit their behaviors [42, 188]. In response, Explainable Artificial Intelligence (XAI) aims to foster trust and accountabil…
Rise of Machine Agency: A Framework for Studying the Psychology of Human–AI Interaction (HAII)
Communication scholars began studying our interactions with the technologies themselves. Several studies documented our tendency to treat computers as if they are autonomous social actors (Reeves & Na…
See you soon again, chatbot? A design taxonomy to characterize user-chatbot relationships with different time horizons
Users interact with chatbots for various purposes and motivations – and for different periods of time. However, since chatbots are considered social actors and given that time is an essential componen…
Simulating Society Requires Simulating Thought
Simulating society with large language models (LLMs), we argue, requires more than generating plausible behavior; it demands cognitively grounded reasoning that is structured, revisable, and traceable…
Social Responses to Media Technologies in the 21st Century: The Media are Social Actors Paradigm
we propose the Media are Social Actors (MASA) paradigm as a structured extension of the CASA paradigm. We suggest that an enhanced framework that builds on the CASA paradigm, expounds the effects …
Social Robots for Long-Term Interaction: A Survey
As the field of HRI evolves, it is important to understand how users interact with robots over long periods. This paper reviews the current research on long-term interaction between users and…
Systematic synthesis of design prompts for large language models in conceptual design
Conceptual design can be modeled as a proposition making process, where designers make logical propositions to communicate and construct intangible concepts. Not only can LLMs interpret designers’ pro…
The Digital Therapeutic Alliance and Human-Computer Interaction
This conceptual paper explores one such instrument that has been proposed in the literature, the Mobile Agnew Relationship Measure, and examines it through a human-computer interaction (HCI) lens. Thr…
The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas
Large Language Models (LLMs) have shown promise in accelerating the scientific research pipeline. A key capability for this process is the ability to generate novel research ideas, and prior studies h…
Through the Lens of Human-Human Collaboration: A Configurable Research Platform for Exploring Human-Agent Collaboration
Research on LLM agents [68, 69, 82], which are LLM systems capable of exhibiting complex, human-like behaviors to solve tasks, shows these agents can yield distinct, believable cognitive and social b…
Toward Reasonable Parrots: Why Large Language Models Should Argue with Us by Design
we advocate for the development of conversational technology that is inherently designed to support and facilitate argumentative processes. We argue that, at present, large language models (LLMs) are …
Towards Algorithmic Experience
Algorithms currently have direct implications in our democracies and societies, but they also define mostly all our daily activities as users, defining our decisions and promoting different behaviors.…
Towards Human-centered Proactive Conversational Agents
Recent research on proactive conversational agents (PCAs) mainly focuses on improving the system’s capabilities in anticipating and planning action sequences to accomplish tasks and achieve goals befo…
Trust in Human-AI Interaction: Scoping Out Models, Measures, and Methods
Trust has emerged as a key factor in people’s interactions with AI-infused systems. Yet, little is known about what models of trust have been used and for what systems: robots, virtual characters, sma…
User-LLM: Efficient LLM Contextualization with User Embeddings
Large language models (LLMs) have revolutionized natural language processing. However, effectively incorporating complex and potentially noisy user interaction data remains a challenge. To address thi…
UserBench: An Interactive Gym Environment for User-Centric Agents
Large Language Models (LLMs)-based agents have made impressive progress in reasoning and tool use, enabling them to solve complex tasks. However, their ability to proactively collaborate with users, e…
Using Large Language Models to Create AI Personas for Replication and Prediction of Media Effects: An Empirical Test of 133 Published Experimental Research Findings
Our LLM replications successfully reproduced 76% of the original main effects (84 out of 111), demonstrating strong potential for AI-assisted replication of studies in which people respond to media st…
Using Large Language Models to Generate, Validate, and Apply User Intent Taxonomies
Log data can reveal valuable information about how users interact with Web search services, what they want, and how satisfied they are. However, analyzing user intents in log data is not easy, especia…
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
Human values are crucial to human decision-making. Value pluralism is the view that multiple correct values may be held in tension with one another (e.g., when considering lying to a friend to protect…
Virtual Assistance in Any Context
Several domain-specific assistants in the form of chatbots have conquered many commercial and private areas. However, there is still a limited level of systematic knowledge of the distinctive…
WHEN TO ACT, WHEN TO WAIT: Modeling Structural Trajectories for Intent Triggerability in Task-Oriented Dialogue
Task-oriented dialogue systems often face difficulties when user utterances seem semantically complete but lack necessary structural information for appropriate system action. This arises because user…
What Makes a Good Natural Language Prompt?
Despite the importance of understanding natural language prompts, there remains limited consensus on how to quantify them. Current approaches rely predominantly on outcome-centric measurements, such a…
Workplace Everyday-Creativity through a Highly-Conversational UI to Large Language Models
We explore everyday co-creativity for collaborative human-AI teams in workplaces via a conversational user interface to a large language model. Previous short papers explored human-AI team-creativity …
Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task
limitations. This study focuses on finding out the cognitive cost of using an LLM in the educational context of writing an essay. We assigned participants to three groups: LLM group, Search Engine gr…