Why do transformer models still miss implicit discourse relations in anxiety detection?

This note explores why transformers struggle to read the *relationships between statements*, the causal, cross-sentence reasoning that signals anxiety, rather than just the emotional words on the surface. The corpus points to a clear culprit: anxiety doesn't live in vocabulary, it lives in discourse. Anxious thinking shows up as overgeneralization built through chains of causal reasoning across statements: "because X happened, Y will fail, which means Z is hopeless." Discourse-level causal features predict anxiety more accurately than individual words do, and the best results come from a dual model that reads both levels at once (see "Why do discourse patterns predict anxiety better than single words?"). A model tuned to spot worried-sounding words will keep missing the inter-statement logic that actually does the diagnostic work.

Why is that cross-statement layer so hard for transformers to pick up? Part of the answer is what training rewards. Models are optimized to predict information, not to track the implicit relational and structural work that holds a stretch of discourse together, which is exactly why they don't naturally develop conversation-maintenance moves like reference repair or topic hand-off (see "Why don't language models develop conversation maintenance skills?"). Implicit discourse relations belong to that same under-rewarded category: nobody labels them, and the training signal doesn't push the model to encode them, so they stay latent.

There's also a deeper representational story. Transformers tend to carry knowledge as continuous flow through the residual stream rather than as stored, addressable facts, so that knowledge is contextual and inseparable from the act of generating (see "Do transformer models store knowledge or generate it continuously?"). And when context conflicts with strong priors learned in training, the priors win: models generate outputs inconsistent with what's actually in front of them, and prompting alone can't fix it (see "Why do language models ignore information in their context?"). An implicit discourse relation *is* a piece of context that has to be integrated against the model's defaults, so the same failure mode that makes models ignore their context shows up as missed inter-statement reasoning.

The clinical-dialogue notes sharpen the stakes. When transformers do try to read between the lines, they tend to over-read, injecting emotional interpretations the user never expressed (see "Do language models add feelings users never actually expressed?"), or they default to problem-solving the moment someone discloses emotion, a hallmark of low-quality therapy likely driven by RLHF's helpfulness bias (see "Do LLM therapists respond to emotions like low-quality human therapists?"). So it's not simply that models under-detect discourse signal; they're miscalibrated about it in both directions, hallucinating relations that aren't there while missing the ones that are.

The quietly surprising part: the discourse signal may already be inside the network, just not surfaced. Models trained with hidden chain-of-thought tokens compute the correct answer in early layers and then suppress it in the final layers to satisfy output formatting; the real computation is recoverable from lower-ranked predictions even when the final tokens are filler (see "Do transformers hide reasoning before producing filler tokens?"). That reframes the whole question: the fix for implicit discourse relations might be less about adding capability and more about not discarding the cross-statement reasoning the model already performs, whether through dual-level architectures (see "Why do discourse patterns predict anxiety better than single words?") or through uncertainty-aware objectives that let a model abstain instead of guessing when the relation is ambiguous (see "Can models learn to abstain when uncertain about predictions?").

Sources (8 notes)

Why do discourse patterns predict anxiety better than single words?

Causal explanations across statements—not individual words—are the strongest predictor of anxiety because anxious thinking involves overgeneralization through inter-statement reasoning. A dual model combining both representation levels outperforms either alone.
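
To make the "two representation levels" idea concrete, here is a minimal sketch of a feature extractor that scores a text at the word level and at the discourse level and concatenates the two. It is not the method from the note: the lexicons `WORRY_WORDS` and `CAUSAL_MARKERS`, the function names, and the ratio-based features are all assumptions made for illustration. It also only catches *explicit* causal connectives; implicit relations, which are the hard case discussed above, would need a learned relation classifier in place of the marker lookup.

```python
import re

# Hypothetical, deliberately tiny lexicons (illustration only).
WORRY_WORDS = {"worried", "afraid", "nervous", "panic", "scared"}
CAUSAL_MARKERS = {"because", "therefore", "which means", "since", "as a result"}


def split_statements(text):
    """Split text into rough statements on sentence-final punctuation."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def word_level_features(text):
    """Fraction of tokens that appear in the worry lexicon."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = sum(1 for t in tokens if t in WORRY_WORDS)
    return [hits / max(len(tokens), 1)]


def discourse_level_features(text):
    """Fraction of statements that contain an explicit causal marker."""
    statements = split_statements(text)
    causal = 0
    for s in statements:
        padded = " " + s.lower() + " "
        if any(f" {m} " in padded for m in CAUSAL_MARKERS):
            causal += 1
    return [causal / max(len(statements), 1)]


def dual_features(text):
    """Concatenate both levels so a downstream classifier sees each."""
    return word_level_features(text) + discourse_level_features(text)


example = (
    "I failed the quiz. Because I failed, I will fail the course, "
    "which means I am hopeless."
)
print(dual_features(example))
```

The example text contains no worry-lexicon words at all, so a word-level model would see nothing, while the discourse-level feature still registers the causal chain.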

Why don't language models develop conversation maintenance skills?

Humans keep conversations smooth through implicit techniques like reference repair and topic hand-off that sustain relational interaction, not convey information. Language models don't develop these because training signals reward information prediction, not relational work.

Do transformer models store knowledge or generate it continuously?

Transformers organize knowledge as flowing activations rather than retrievable archives, mirroring oral cultures where knowledge exists only in performance. This explains why model knowledge is contextual, difficult to edit, and inseparable from generation.

Why do language models ignore information in their context?

Research demonstrates that LMs generate outputs inconsistent with their context because parametric knowledge from training dominates over in-context information. Textual prompting alone cannot override strong priors; causal intervention in representations is required.

Do language models add feelings users never actually expressed?

Therapists reviewing GPT-4 in the CaiTI system found it "reads into" user feelings rather than responding objectively. Task decomposition across specialized models (Reasoner/Guide/Validator) reduces but does not eliminate this interpretation bias.

Do LLM therapists respond to emotions like low-quality human therapists?

Using the BOLT framework, researchers found LLMs offer solution-focused advice during emotional disclosure—a hallmark of low-quality therapy—yet also reflect more on client needs and strengths than typical poor human therapy, creating an unusual hybrid profile likely driven by RLHF's helpfulness bias.

Do transformers hide reasoning before producing filler tokens?

Logit lens analysis shows models trained with hidden CoT tokens compute correct answers in layers 1-3, then actively suppress these representations in final layers to produce format-compliant filler output. The reasoning is fully recoverable from lower-ranked token predictions.
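
As a rough picture of what a logit lens readout does, the sketch below projects a hidden-state vector from each layer onto a vocabulary matrix and ranks the tokens. Everything here is a toy assumption: the three-token vocabulary, the two-dimensional hidden states, the `unembed` matrix, and the layer names are invented for illustration, and the final normalization that real implementations usually apply before unembedding is omitted.

```python
def logits_from_hidden(hidden, unembed):
    """Dot product of one hidden-state vector with each vocabulary row."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in unembed]


def ranked_tokens(hidden, unembed, vocab):
    """Return vocabulary tokens ordered from highest to lowest logit."""
    logits = logits_from_hidden(hidden, unembed)
    order = sorted(range(len(vocab)), key=lambda i: logits[i], reverse=True)
    return [vocab[i] for i in order]


# Toy vocabulary: an answer token, a filler token, and a distractor.
vocab = ["7", "...", "the"]
unembed = [
    [1.0, 0.0],    # direction for the answer token "7"
    [0.0, 1.0],    # direction for the filler token "..."
    [-1.0, -1.0],  # direction for the distractor "the"
]

# Hypothetical hidden states at two depths of the same forward pass.
layer_hidden = {
    "early": [2.0, 0.5],  # answer direction dominates
    "final": [1.0, 3.0],  # filler direction dominates, answer still present
}

for name, hidden in layer_hidden.items():
    print(name, ranked_tokens(hidden, unembed, vocab))
```

In this toy setup the answer token ranks first at the early layer and second at the final layer, which is the shape of the finding in the note: the top prediction is filler, but the computed answer survives among the lower-ranked tokens.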

Can models learn to abstain when uncertain about predictions?

Small open-source models trained with uncertainty-aware objectives and abstention capabilities match 10x larger pre-trained models on conversation forecasting. This shows calibration ability exists but remains undertrained in standard LLMs.
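
The note describes abstention learned through uncertainty-aware training objectives. A much simpler stand-in, shown here only to illustrate the behavior, is a confidence threshold applied at inference time: the classifier returns a discourse-relation label when its top probability clears the threshold and abstains otherwise. The label set, the `threshold` default, and the function names are assumptions, not details from the note.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_abstain(logits, labels, threshold=0.6):
    """Return (label, confidence), or ("abstain", confidence) if unsure."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    if probs[best] < threshold:
        return "abstain", probs[best]
    return labels[best], probs[best]


# Hypothetical relation labels for a pair of adjacent statements.
labels = ["causal", "contrast", "none"]

print(predict_or_abstain([3.0, 0.0, 0.0], labels))  # clear-cut case
print(predict_or_abstain([0.5, 0.4, 0.3], labels))  # ambiguous case
```

A fixed threshold only works if the probabilities are reasonably calibrated, which is the part the training objectives in the note are meant to address.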
