SYNTHESIS NOTE

Can language models balance competing ethical norms in context?

Do LLMs genuinely weigh trade-offs between honesty, helpfulness, and harm prevention based on what a specific conversation needs, or do they rigidly enforce fixed corporate values regardless of situation?

Synthesis note · 2026-05-01 · sourced from Conversation Topics Dialog

Gricean pragmatics insists on situated normativity: speakers do not blindly follow maxims (quantity, quality, relation, manner) but apply, suspend, violate, or exploit them according to context. When a doctor withholds a terminal diagnosis from a frightened patient, the doctor violates the maxim of quantity to uphold compassion. The violation is not a failure — it is the right move in context, and a competent hearer recognizes it as such. Pragmatic competence is the ability to navigate these conflicts, not the ability to maximize each maxim independently.

LLMs trained on the helpful-honest-harmless triad cannot perform this kind of contextual reasoning. The corporate persona is fixed at the model level: when a user asks for accessible simplification of a complex topic for a child, the model trained for honesty refuses to soften because softening reads as less accurate. When a user asks for sarcastic humor, the model trained for harmlessness refuses to play. The user cannot persuade the model to relax its norms because the norms are structural defaults rather than negotiable conversational moves.

Kasirzadeh and Gabriel describe this as pragmatic dissonance. The model mechanically enforces global norms even when local context demands tailored adherence. The result is communication that adheres to ethical principles at the cost of pragmatic appropriateness, which is exactly the trade-off that situated normativity is meant to navigate. What humans treat as a single integrated competence becomes, in the LLM, two separate layers in tension with each other: norm enforcement and pragmatic fit.

Related concepts in this collection 2

When should human values enter the LLM development pipeline?
Explores whether human-centered concerns like safety and fairness work better as early design principles throughout development, or as post-training alignment patches. Matters because pipeline placement determines whether human priorities shape the foundation or fight against it.
This note exemplifies the post-training-patch failure: a fixed corporate persona, set late, cannot perform the context-specific, human-centered reasoning that the upstream pipeline should encode.

Can human-centered LLM design ever achieve universal solutions?
If harm and benefit depend on who you ask and how you measure them, can we design LLM systems that satisfy all stakeholders? This explores why broad values like safety and justice resist one-size-fits-all implementation.
This note exemplifies the frozen-operationalization danger: a fixed corporate persona encodes one developer-chosen reading of harm rather than a revisable stakeholder process.

Original note title

LLM refusals and tone choices reflect overarching corporate values rather than context-specific Gricean norm-balancing
