LLM Neuroanatomy II: Relayering Works Across Modern Models, Hints at Universal Language in Transformer Reasoning
Key Takeaways
- ▸Relayering (the RYS method) generalizes across modern Transformers: it is not limited to Qwen2-72B and has now been demonstrated on Qwen3.5, MiniMax, and GLM-4.7 models
- ▸LLMs exhibit a three-phase functional anatomy: encoding (early layers), format-agnostic reasoning (middle layers), and decoding (late layers)
- ▸Evidence suggests a universal language or "thinking space" in the middle layers, where semantic meaning is encoded independently of input format (natural language across different languages, code, or encodings such as Base64)
Summary
Researcher Berkeaslan has published findings extending previous work on RYS (Repeat Your Self), a method discovered in mid-2024 in which duplicating seven middle layers of Qwen2-72B, without any weight changes, improved model performance. New research across modern models, including Qwen3.5-27B, MiniMax M2.5, and GLM-4.7, confirms that relayering is a general property of Transformers rather than a one-off fluke. The findings point to a three-phase structure in LLM architecture: early layers for encoding, middle layers for format-agnostic reasoning in a universal "thinking space," and late layers for decoding.
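The core operation described above, duplicating a contiguous block of middle layers while keeping their weights unchanged, can be sketched in a few lines. This is a minimal illustration, not the authors' released code; the function name `relayer` and the list-of-layers representation are assumptions (in a Hugging Face model the layer list would typically live at `model.model.layers`):

```python
def relayer(layers, start, end, repeats=2):
    """Return a new layer sequence in which layers[start:end] appears
    `repeats` times in a row.

    The duplicated entries are the *same* layer objects, so weights are
    shared and nothing is retrained -- matching the RYS idea of pure
    duplication without weight changes.
    """
    middle = layers[start:end]
    return list(layers[:start]) + list(middle) * repeats + list(layers[end:])


# Toy demo: 8 "layers" labeled a..h, duplicating the middle span d..e once.
stack = list("abcdefgh")
relayered = relayer(stack, 3, 5)
```

Because the duplicated entries alias the originals, the relayered model costs extra compute per forward pass but no extra parameters on disk.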
An experiment by Evan Maunder, refined by Berkeaslan, tests this universal-language hypothesis directly. By measuring the cosine similarity of hidden states across semantically identical inputs in English, Mandarin, and Base64, the researchers showed that middle-layer representations converge to near-perfect similarity regardless of input format. A further six-way comparison of English and Chinese facts and poems suggests that the middle layers encode semantic meaning in a language-agnostic manner: content similarity matters more than linguistic form in the model's internal reasoning space.
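The measurement above can be sketched as a layer-by-layer cosine comparison of hidden states. This is an assumed reconstruction, not the published experiment code; in practice the per-layer states would come from a forward pass with `output_hidden_states=True`, and here we mean-pool over tokens so inputs of different lengths (e.g. English vs. Base64) stay comparable:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def layerwise_similarity(hidden_a, hidden_b):
    """Compare two runs layer by layer.

    hidden_a, hidden_b: one (seq_len, d) array per layer. Sequences may
    differ in length, so each layer's states are mean-pooled over tokens
    before comparison. Returns one similarity score per layer.
    """
    return [cosine(a.mean(axis=0), b.mean(axis=0))
            for a, b in zip(hidden_a, hidden_b)]
```

Under the universal-language hypothesis, the resulting curve would dip at the encoding/decoding ends and peak near 1.0 across the middle layers for semantically identical inputs.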
The research involved scanning 3,024 beam search candidates and validating 2 million configurations using a surrogate model. Berkeaslan has released scanning code and new RYS model variants, particularly focusing on the 27B parameter range, which balances scientific interest with practical accessibility for community users. The findings have implications for understanding model efficiency, interpretability, and the fundamental structure of transformer-based language models.
- Relayering survives in compact 27B models despite greater functional entanglement, indicating robust circuit structure across model scales
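The configuration scan can be sketched generically: enumerate candidate middle-layer spans, then rank them with a cheap surrogate scorer so only the most promising candidates get full evaluation. The exact beam-search procedure and surrogate model are not described here, so the names `enumerate_spans` and `surrogate_scan` and the span-based search space are illustrative assumptions:

```python
import itertools

def enumerate_spans(n_layers, max_span=7):
    """All contiguous (start, end) layer spans up to max_span layers long --
    a simple stand-in for the candidate space of relayering configurations."""
    return [(s, e) for s, e in itertools.combinations(range(n_layers + 1), 2)
            if e - s <= max_span]

def surrogate_scan(spans, surrogate_score, keep=8):
    """Rank candidate spans with a cheap surrogate score and keep the best.

    surrogate_score: callable mapping a span to a number, standing in for
    a learned surrogate that predicts benchmark performance without
    running the full evaluation on every configuration.
    """
    return sorted(spans, key=surrogate_score, reverse=True)[:keep]
```

The point of the surrogate step is scale: scoring a span is cheap, so millions of configurations can be triaged while only the shortlist is benchmarked for real.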
Editorial Opinion
This research represents a significant step forward in mechanistic interpretability of large language models. The direct observation of format-agnostic reasoning in middle layers—confirmed across multiple input modalities and languages—provides compelling evidence for a universal internal language or reasoning substrate in Transformers. If this finding generalizes across architectures and scales, it could fundamentally reshape how we understand, design, and optimize language models, potentially enabling new efficiency improvements and advancing the broader AI safety agenda through better interpretability.