SYNTHESIS NOTE
Conversational AI and Personalization · Psychology, Society, and Alignment · Model Architecture and Internals

Can chatbots learn new knowledge without losing their personality?

Character chatbots struggle to absorb domain knowledge through fine-tuning because the fine-tuning erases their distinctive personality traits. Can model merging techniques add factual knowledge while preserving the persona?

Synthesis note · 2026-04-18 · sourced from Personas Personality
How accurately can language models simulate human personalities? How do you build domain expertise into general AI models?

Character chatbots face a fundamental tension: they need domain knowledge to be useful, but sequential fine-tuning on knowledge datasets causes catastrophic forgetting of persona traits. Chamain (2024) addresses this with a two-step model merging approach that relies on a partial separation between knowledge and personality across transformer layers.

Step one is a parameter-wise combination of the weights of task vectors (derived from instruction-tuned models) and character vectors. This integrates factual knowledge without fully overwriting character behavior. Step two is a layer-wise merge of the deeper layers of the character model, which carry more persona-specific stylistic information. The method is reported to retain approximately 80% of task-specific performance while maintaining character portrayal ability.
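The two steps can be sketched on toy weights. This is a minimal illustration, not Chamain's implementation: the linear task-arithmetic formula, the coefficients `alpha` and `beta`, the `layers.<index>.weight` naming scheme, the `first_persona_layer` cutoff, and the choice to copy deeper layers outright rather than interpolate them are all assumptions made for the example. Plain Python lists stand in for real tensors.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def task_vector(finetuned: Weights, base: Weights) -> Weights:
    """Difference between a fine-tuned checkpoint and the shared base model."""
    return {k: [f - b for f, b in zip(finetuned[k], base[k])] for k in base}


def parameter_wise_merge(base: Weights, knowledge: Weights, character: Weights,
                         alpha: float = 0.5, beta: float = 0.5) -> Weights:
    """Step one (assumed form): add scaled knowledge and character vectors to the base."""
    tv_k = task_vector(knowledge, base)
    tv_c = task_vector(character, base)
    return {
        k: [b + alpha * dk + beta * dc
            for b, dk, dc in zip(base[k], tv_k[k], tv_c[k])]
        for k in base
    }


def layer_index(name: str) -> int:
    """Assumes hypothetical parameter names of the form 'layers.<index>.weight'."""
    return int(name.split(".")[1])


def layer_wise_merge(merged: Weights, character: Weights,
                     first_persona_layer: int) -> Weights:
    """Step two (assumed form): take the deeper layers from the character model."""
    return {
        k: list(character[k]) if layer_index(k) >= first_persona_layer else list(v)
        for k, v in merged.items()
    }


# Two-layer toy example: knowledge moves the first coordinate, persona the second.
base = {"layers.0.weight": [0.0, 0.0], "layers.1.weight": [0.0, 0.0]}
knowledge = {"layers.0.weight": [1.0, 0.0], "layers.1.weight": [1.0, 0.0]}
character = {"layers.0.weight": [0.0, 2.0], "layers.1.weight": [0.0, 2.0]}

step1 = parameter_wise_merge(base, knowledge, character, alpha=1.0, beta=0.5)
step2 = layer_wise_merge(step1, character, first_persona_layer=1)
print(step2)
```

In this sketch the shallow layer keeps the blended weights from step one, while the deeper layer comes entirely from the character model. Where the cutoff sits and how strongly each vector is scaled would need to be tuned against both task and persona evaluations.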

This is notable because it avoids three expensive alternatives: (1) collecting character-specific training data for every domain, (2) training from scratch, and (3) multi-task learning requiring balanced datasets. Model merging treats persona and knowledge as independently trained capabilities that can be composed post hoc.

The broader implication connects to the related note "Can we track and steer personality shifts during model finetuning?": personality and knowledge occupy partially separable subspaces in model parameters, and this separability can be exploited architecturally. Chamain works at the weight level while persona vectors work at the activation level, but both depend on the same underlying phenomenon: personality traits are localized enough to be preserved or steered independently of task knowledge.
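For contrast with weight-level merging, activation-level steering can be sketched as follows. The note does not say how persona vectors are computed, so the difference-of-means construction here is an assumption (a common recipe for activation steering), and the function names and `strength` parameter are hypothetical.

```python
from typing import List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Coordinate-wise mean of a list of equal-length activation vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def persona_direction(with_trait: List[Vector], without_trait: List[Vector]) -> Vector:
    """Assumed construction: mean activation with the trait minus mean without it."""
    mu_with, mu_without = mean(with_trait), mean(without_trait)
    return [a - b for a, b in zip(mu_with, mu_without)]


def steer(hidden: Vector, direction: Vector, strength: float) -> Vector:
    """Shift a hidden state along the persona direction; model weights are untouched."""
    return [h + strength * d for h, d in zip(hidden, direction)]


# Toy activations collected at one layer, with and without the trait expressed.
direction = persona_direction(
    with_trait=[[1.0, 2.0], [3.0, 2.0]],
    without_trait=[[0.0, 1.0], [2.0, 1.0]],
)
print(steer([0.0, 0.0], direction, strength=2.0))   # amplify the trait
print(steer([0.0, 0.0], direction, strength=-1.0))  # suppress the trait
```

The design difference is where the intervention lives: merging changes parameters once and ships a new checkpoint, whereas steering leaves parameters fixed and modifies hidden states at inference time.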

Original note title

model merging can integrate domain knowledge into character chatbots without catastrophic forgetting of persona — layer-wise merging preserves style while parameter-wise merging adds knowledge