Does model access level determine which specialization techniques work?
Different specialization approaches require different levels of access to a model's internals. Understanding this constraint helps practitioners choose realistic techniques for their domain adaptation goals.
The domain specialization survey organizes the technique landscape around a single governing variable: how much access does the practitioner have to the model's internals? This produces three tiers that are not just organizational — they determine the ceiling on what specialization can achieve.
Black-box (external augmentation): No access to model parameters, gradients, or loss values. Techniques: RAG, tool use, output post-processing, and injecting domain knowledge into the prompt. Domain knowledge is incorporated into the input or used to filter the output. The model itself is unchanged. This is the most accessible tier, since any API user can apply it, but the specialization is shallow: the model applies pre-existing general capabilities to domain-enriched prompts. Domain knowledge that is neither in the model's weights nor supplied in the context window cannot be used.
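A minimal sketch of black-box augmentation, assuming a toy keyword-overlap retriever in place of a real vector store. The function names (`retrieve`, `build_prompt`) and the prompt wording are hypothetical, and no model is called: the point is only that domain knowledge enters through the input text.

```python
from __future__ import annotations

import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return up to k documents that share at least one token with the query.

    Toy scoring (assumption): rank by number of shared tokens. A real system
    would typically use embeddings or BM25 instead.
    """
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return [d for d in ranked[:k] if q & tokenize(d)]


def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Assemble a domain-enriched prompt; the model itself is never touched."""
    context = retrieve(query, documents, k)
    lines = ["Use only the context below to answer.", "Context:"]
    lines += [f"- {c}" for c in context] or ["- (no matching context found)"]
    lines.append(f"Question: {query}")
    return "\n".join(lines)


if __name__ == "__main__":
    docs = [
        "Warfarin interacts with aspirin and raises bleeding risk.",
        "The cafeteria opens at 8am.",
        "Aspirin is an NSAID.",
    ]
    print(build_prompt("Does aspirin interact with warfarin?", docs))
```

Whatever the retriever fails to surface never reaches the model, which is the shallow-specialization limit described above.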
Grey-box (prompt crafting): Access to gradient or loss values, allowing finer control over model behavior without modifying parameters. Techniques: continuous prompt tuning, soft prompts, learnable prompt vectors. The model's behavior is shaped by optimized prompt representations rather than natural-language instructions. More powerful than discrete prompting because the optimization happens in continuous embedding space rather than over discrete tokens, but the model's weights still do not change.
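The grey-box idea can be illustrated with a deliberately tiny stand-in, sketched here under strong assumptions: the "model" is a frozen logistic unit, and the soft prompt is a vector added to the input embedding. Real prompt tuning prepends learned vectors to a transformer's input sequence; this additive toy only shows the division of labor, where gradients update the prompt and never the weights. All names (`FROZEN_W`, `tune_prompt`) are hypothetical.

```python
import math

# Stand-in for frozen model weights (assumption: a 3-dim logistic unit).
# Stored as a tuple so the sketch cannot modify them by accident.
FROZEN_W = (0.8, -0.5, 0.3)


def predict(prompt, x):
    """Frozen model applied to the input embedding shifted by the soft prompt."""
    z = sum(w * (p + xi) for w, p, xi in zip(FROZEN_W, prompt, x))
    return 1.0 / (1.0 + math.exp(-z))


def loss(prompt, data):
    """Mean binary cross-entropy of the frozen model under a given prompt."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = predict(prompt, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def tune_prompt(data, steps=200, lr=0.5):
    """Gradient descent on the prompt vector only; FROZEN_W is never updated."""
    prompt = [0.0] * len(FROZEN_W)
    for _ in range(steps):
        grad = [0.0] * len(FROZEN_W)
        for x, y in data:
            err = predict(prompt, x) - y  # dLoss/dz for sigmoid + cross-entropy
            for i, w in enumerate(FROZEN_W):
                grad[i] += err * w / len(data)  # dz/dprompt_i = w_i
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt


if __name__ == "__main__":
    data = [((1, 0, 0), 1), ((0, 1, 0), 1), ((0, 0, 1), 1), ((-3, 0, 0), 0)]
    tuned = tune_prompt(data)
    print("loss before:", loss([0.0, 0.0, 0.0], data))
    print("loss after: ", loss(tuned, data))
```

In this toy the prompt can only shift the frozen model's decision threshold. That is a cruder limit than in a real transformer, but it points the same way: the prompt steers what the weights already compute.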
White-box (model fine-tuning): Full access to model parameters. Techniques: full fine-tuning, LoRA, adapter layers, continued pre-training. Domain knowledge is incorporated directly into model weights. Most powerful but most resource-intensive: it requires domain-specific datasets, compute, and expertise. It also carries the highest risk of the failure mode described in "Why do specialized models fail outside their domain?": sharp performance cliffs once queries leave the target domain.
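To make the cost gap between full fine-tuning and LoRA concrete, here is a small sketch in plain Python. It assumes the standard LoRA formulation, in which a frozen weight matrix W is augmented by a low-rank product B·A, so the layer computes W x + scale · B (A x). The layer sizes and helper names are illustrative, not taken from the survey.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for LoRA: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """Compute W x + scale * B (A x). W stays frozen; only A and B would train."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


if __name__ == "__main__":
    d, r = 4096, 8  # illustrative layer width and rank
    print(full_finetune_params(d, d))  # 16777216
    print(lora_params(d, d, r))        # 65536

    W = [[1.0, 0.0], [0.0, 1.0]]
    A = [[1.0, 1.0]]               # rank 1
    B_zero = [[0.0], [0.0]]        # with B at zero, the base model is unchanged
    print(lora_forward(W, A, B_zero, [1.0, 2.0]))  # [1.0, 2.0]
```

For a 4096 × 4096 layer at rank 8, LoRA trains 65,536 parameters against 16,777,216 for full fine-tuning, about 0.4% of the matrix. Both still require weight access, which is what places them in the white-box tier.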
The access level is usually determined by organizational context rather than technical preference. API-only deployment, which covers most enterprise use, limits practitioners to black-box techniques. Grey-box techniques need gradient or loss signals, which typically means holding the model weights. White-box techniques additionally need the infrastructure to train.
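The mapping from access level to available techniques can be written down as a small lookup. This is a sketch under one assumption not stated in the note: that the tiers nest, so a team with weight access can also use grey-box and black-box techniques. The `Access` enum and function name are hypothetical.

```python
from enum import IntEnum


class Access(IntEnum):
    """Ordered access levels (assumption: each level includes the ones below)."""
    API_ONLY = 0    # inputs and outputs only
    GRADIENTS = 1   # gradient or loss values are observable
    PARAMETERS = 2  # weights can be modified and retrained


# Tier names and techniques as listed in the note, in order of required access.
TIERS = [
    ("black-box", ["RAG", "tool use", "output post-processing",
                   "domain knowledge in the prompt"]),
    ("grey-box", ["continuous prompt tuning", "soft prompts",
                  "learnable prompt vectors"]),
    ("white-box", ["full fine-tuning", "LoRA", "adapter layers",
                   "continued pre-training"]),
]


def available_techniques(access: Access) -> dict:
    """Return every tier, with its techniques, reachable at this access level."""
    return {
        name: techniques
        for level, (name, techniques) in enumerate(TIERS)
        if level <= access
    }


if __name__ == "__main__":
    for level in Access:
        print(level.name, "->", list(available_techniques(level)))
```

An API-only team gets a single entry back, which makes the constraint explicit: moving to another tier is a change in access, not in effort.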
This taxonomy matters because practitioners often default to prompt-based approaches without recognizing the ceiling examined in "Can prompt optimization teach models knowledge they lack?": prompting can activate knowledge the model already has, but it cannot add knowledge the model lacks. When the required domain knowledge isn't in the model's training distribution, no amount of prompting will supply it, and the tier must change.
Inquiring lines that use this note as a source 7
This note is a source for these synthesized inquiries.
- What causes models to develop domain capability cliffs after specialization?
- What access constraints allow description-based adaptation but block conventional techniques?
- Do different domains require different types of model investment?
- How does over-specialization create capability cliffs outside target domains?
- How do trait adapters interact with different base model architectures?
- What happens when you project the same model onto different harnesses?
- What distinctive properties make open foundation models different from closed ones?
Related concepts in this collection 3
- Can prompt optimization teach models knowledge they lack?
  Explores whether sophisticated prompting techniques can inject new domain knowledge into language models, or if they're limited to activating existing training knowledge.
  the fundamental ceiling on black-box and grey-box approaches
- How do knowledge injection methods trade off flexibility and cost?
  When and how should domain knowledge enter an AI system? This explores the speed, training cost, and adaptability trade-offs across four injection paradigms, and when each approach suits different deployment constraints.
  orthogonal taxonomy: same techniques, different organizing dimension (runtime vs. training-time)
- Why do specialized models fail outside their domain?
  Deep domain optimization creates sharp performance cliffs at domain boundaries. Specialized models generate plausible-sounding but ungrounded responses when queries fall outside their training scope, and often fail to signal their own ignorance.
  white-box fine-tuning creates the highest cliff risk
Related papers in this collection 8
Papers most semantically related to this note, ranked by cosine similarity in the embedding space.
- Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey
- Large Language Models Sensitivity to The Order of Options in Multiple-Choice Questions
- Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
- Levels of Analysis for Large Language Models
- Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence
- RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems
- Divide-or-Conquer? Which Part Should You Distill Your LLM?
- Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study
Original note title
domain specialization access taxonomy — black box grey box white box determines available techniques