Research Reveals Reasoning LLMs May Decide Before They Think: Early-Encoded Decisions Shape Chain-of-Thought
Key Takeaways
- Linear probes can decode tool-calling decisions from a model's activations before any reasoning tokens are generated, suggesting that decisions are encoded early in reasoning LLMs
- Activation steering experiments show that perturbing decision-encoding directions flips the model's behavior in 7-79% of cases, with models rationalizing rather than resisting the change
- Chain-of-thought reasoning appears to serve as post-hoc rationalization of pre-encoded decisions rather than genuine deliberation
Summary
A new research paper titled "I Think, Therefore I Am" challenges our understanding of how reasoning LLMs make decisions, suggesting that these models may encode action choices before they begin their deliberative reasoning process. The researchers demonstrate that simple linear probes can decode tool-calling decisions from the model's internal activations with high confidence, in some cases before a single reasoning token has been produced. Through activation steering experiments, the team shows that perturbing decision-encoding directions leads to inflated deliberation and behavioral reversals in 7-79% of cases, depending on the model and benchmark tested.
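To make the probing setup concrete, here is a minimal sketch of the idea: capture the hidden state at the last prompt token, before any reasoning tokens are generated, and fit a linear classifier on it. The model name, layer choice, and toy prompts and labels below are illustrative assumptions, not the paper's actual setup.

```python
# Sketch: probe pre-generation activations for an upcoming tool-call decision.
# Assumptions: a Hugging Face causal LM, a single layer's residual stream,
# and toy prompts/labels standing in for a real benchmark.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

model_name = "Qwen/Qwen2.5-7B-Instruct"  # placeholder reasoning model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def pre_generation_activation(prompt: str, layer: int = -1) -> torch.Tensor:
    """Hidden state of the final prompt token, captured before generation starts."""
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    return out.hidden_states[layer][0, -1]  # shape: (hidden_dim,)

# Toy data: label is 1 if the model went on to call a tool for this prompt, else 0.
prompts = ["What is 37 * 91? A calculator tool is available.", "Write a haiku about rain."]
labels = [1, 0]

X = torch.stack([pre_generation_activation(p) for p in prompts]).float().numpy()
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print("probe accuracy on training prompts:", probe.score(X, labels))
```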
The findings reveal a surprising disconnect between the apparent reasoning process and underlying decision-making. When steering interventions change a model's decision, the subsequent chain-of-thought often rationalizes the flip rather than resisting it, suggesting the reasoning text serves more as post-hoc justification than genuine deliberation. These results indicate that reasoning models may have early-encoded commitments to specific actions that shape and constrain their subsequent "thinking" process, raising important questions about the authenticity of reasoning in current LLMs.
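The steering intervention can be pictured as adding a scaled "decision direction" to the residual stream at some layer while the model generates its chain-of-thought, then checking whether the tool-call decision flips. The sketch below continues from the probe above and reuses its weight vector as the direction; the layer index, scale, and hook-based mechanics are illustrative assumptions about a Llama/Qwen-style architecture, not the paper's exact procedure.

```python
# Sketch: steer generation along a decision direction via a forward hook.
# Continues from the probe sketch above (reuses `model`, `tok`, `probe`).
import torch

direction = torch.tensor(probe.coef_[0]).float()
direction = direction / direction.norm()   # unit-norm decision direction
layer_idx, scale = 20, 8.0                 # assumed intervention layer and strength

def steering_hook(module, args, output):
    # Decoder layers in Llama/Qwen-style models return a tuple whose first
    # element is the hidden states; add the scaled direction to every position.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + scale * direction.to(hidden.device, hidden.dtype)
    return (hidden, *output[1:]) if isinstance(output, tuple) else hidden

handle = model.model.layers[layer_idx].register_forward_hook(steering_hook)
try:
    ids = tok("What is 37 * 91? A calculator tool is available.", return_tensors="pt")
    steered = model.generate(**ids, max_new_tokens=256)
    print(tok.decode(steered[0], skip_special_tokens=True))
finally:
    handle.remove()  # always restore the unsteered model
```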
- The research suggests reasoning LLMs may decide first and think second, challenging assumptions about how these models reason
Editorial Opinion
This research provides a sobering perspective on current reasoning LLMs, suggesting that their deliberative capabilities may be more illusory than substantive. If decisions are genuinely pre-encoded and merely rationalized afterward, it raises critical questions about whether these models possess authentic reasoning abilities or merely sophisticated confabulation mechanisms. Understanding this phenomenon is essential not only for AI development but also for responsible deployment and accurate characterization of LLM capabilities in high-stakes domains.



