28+ papers mapped from one GL framework. Each opens a new domain while pointing back to the same equation. Contact [email protected] for preprints.
Submitted
Paper 3A · NeurIPS 2026
“Lying Is Just a Phase”
The Hidden Alignment Transition in Language Model Scaling
Below a family-dependent critical scale Nc, the coupling between reasoning (HellaSwag) and truthfulness (TruthfulQA) is negative: scaling reasoning hurts truthfulness. Above Nc, the two cooperate. Nc varies 60× across families (0.12B–7B) and is a design parameter, not a physical constant; width, data curation, and architecture each shift it independently. A coupled ODE (sketched below) cross-predicts the held-out Llama-2 family at 5.6% MAE, and an isocline classifier separates standard-trained from curated families. Curated models (Phi, Qwen3) bypass the truthfulness tax entirely.
63 base models · 16 families · r = −0.989 (pre-Nc) · 5.6% ODE MAE
At frontier scale (SWE-bench vs GPQA Diamond, 34+5 models, 10 labs), capabilities remain cooperative (r = +0.72, slope 0.513). The h-field diagnostic, a model's deviation from the cooperation trend, reveals each lab's training philosophy: Google is reasoning-specialist (h̄ = +5.5), Anthropic is coding-rich (h̄ = −6.9). Per-lab coupling slopes span a 5× range (Google 1.15 vs DeepSeek 0.23). Tax excursions are temporary: Sonnet 4.6 (h = −13.1) recovers at Opus 4.6 (h = +3.5). The h-field is descriptive, not causal. Seven falsifiable predictions carry timestamped deadlines.