Geometry Conflict: New Research Reveals Why LLMs Forget During Continual Training
Key Takeaways
- Geometry conflict, a measure of how well the covariance geometry of a task update aligns with the model's current state, accurately predicts whether continual training of an LLM will cause forgetting or successful capability transfer
- Geometry-Conflict Wasserstein Merging (GCWM) improves both retention of existing knowledge and final performance across multiple continual training scenarios without replay data, outperforming existing data-free update-integration methods
- The effectiveness of continual post-training depends on the geometric compatibility between new updates and the model's current state, not just on update magnitude or on other traditional metrics such as gradient conflict or subspace alignment
Summary
Researchers from The Hong Kong Polytechnic University have published a groundbreaking study on a critical challenge in deploying large language models: catastrophic forgetting during continual post-training. The research introduces 'geometry conflict,' a novel concept that explains when sequential updates cause LLMs to lose existing knowledge. Rather than treating forgetting as simply the result of large parameter updates, the researchers show it is a 'state-relative update-integration failure': it occurs when a new task's updates become geometrically incompatible with the model's current state, which has been shaped by previous training.
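The paper's exact definition of geometry conflict is not reproduced here, but the intuition can be sketched: summarize the directions that prior training has already carved out (a covariance-like object over earlier parameter deltas) and score a proposed update by how much of that geometry it aligns with. The Python snippet below is a minimal, hypothetical illustration; the function name, the use of flattened weight deltas, and the normalization are assumptions made for the example, not the authors' formulation.

```python
import numpy as np

def geometry_conflict(prior_updates: np.ndarray, new_update: np.ndarray) -> float:
    """Hypothetical geometry-conflict score (illustrative, not the paper's formula).

    prior_updates: (k, d) matrix of flattened parameter deltas from earlier training
                   stages, standing in for the model's current state geometry.
    new_update:    (d,) flattened parameter delta proposed by the new task.

    Returns a score in [0, 1]: low means the update lies along directions prior
    training already uses; high means it is nearly orthogonal to them, i.e.
    geometrically in conflict with the current state.
    """
    u = new_update / (np.linalg.norm(new_update) + 1e-12)
    # Energy of the unit update that falls along prior directions: u' C u with
    # C = prior_updates.T @ prior_updates, computed without materializing C.
    aligned_energy = float(np.sum((prior_updates @ u) ** 2))
    total_energy = float(np.sum(prior_updates ** 2)) + 1e-12  # trace(C)
    return 1.0 - min(aligned_energy / total_energy, 1.0)
```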
The team proposes Geometry-Conflict Wasserstein Merging (GCWM), a data-free method that uses geometric compatibility as a control signal for integrating new updates while preserving existing capabilities. Tested across Qwen3 models (0.6B to 14B parameters) in both domain-continual and capability-continual settings, GCWM consistently outperformed existing data-free baselines without requiring replay data. This work reconceptualizes continual post-training as fundamentally a problem of managing geometric compatibility between sequential updates, offering a practical framework for deploying continuously improving language models without sacrificing previously learned knowledge.
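GCWM's actual merging rule, including its Wasserstein component, is not detailed in this summary. The sketch below only illustrates the control idea described above: scale how strongly a new update is folded into the model by its geometric compatibility with the current state, so conflicting updates are damped rather than applied at full strength. It reuses the hypothetical geometry_conflict function from the previous sketch and is a stand-in, not the authors' algorithm.

```python
import numpy as np

def compatibility_scaled_merge(base_weights: np.ndarray,
                               prior_updates: np.ndarray,
                               new_update: np.ndarray,
                               max_coeff: float = 1.0) -> np.ndarray:
    """Data-free, conflict-aware merge (illustrative stand-in for GCWM).

    Relies on the hypothetical geometry_conflict() defined in the sketch above.
    The new update is applied with a coefficient proportional to its geometric
    compatibility (1 - conflict) with the state shaped by prior updates.
    """
    conflict = geometry_conflict(prior_updates, new_update)
    coeff = max_coeff * (1.0 - conflict)  # compatible updates are applied more fully
    return base_weights + coeff * new_update
```

Applied per layer or per parameter block, a rule of this kind would yield merge coefficients that a practitioner could inspect before committing an update, with no replay data required.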
Editorial Opinion
This research provides a compelling geometric framework for understanding one of the most pressing challenges in scaling language model development. As companies increasingly need to adapt and improve deployed LLMs without catastrophic performance loss, geometry conflict offers both theoretical insight and practical control mechanisms for safer, more efficient continual training. That capability could significantly accelerate the pace of model improvement across the industry.


