reasoning-core: Open-Source 130M-Param Guardrail Cuts AI Agent Token Usage by Up to 29%
Key Takeaways
- reasoning-core's 130M-parameter supervisor model reduces token consumption by 8–29% across coding tasks while keeping agents aligned to development plans and repository standards
- The tool runs 100% locally (200MB RAM footprint) with no cloud relay or telemetry, ensuring code privacy and zero external data leakage, in contrast to vanilla Claude Code
- Evaluation showed statistically significant gains: plan quality improved by +0.32 points (1–5 scale), agent on-plan compliance by +0.23, and code consistency by +0.43 through reuse of existing repo patterns
Summary
A new open-source project called reasoning-core has been released, offering a lightweight, locally run supervisor for AI code agents. Created by developer jakubkrzysztofsikora, the tool uses a 130M-parameter model (~200MB RAM) to guard against off-plan code changes and token waste. The guardrail can be deployed as a sidecar to Claude Code, Gemini CLI, GitHub Copilot CLI, and Mistral, preventing agents from modifying files outside their intended scope. In benchmarks across 8 real engineering tasks, reasoning-core achieved token savings of up to 29% on individual tasks and 8.2% on average, while improving plan quality and adherence to repository conventions. The tool runs entirely locally with no cloud telemetry, maintaining complete code privacy.
- Shadow-mode deployment lets teams observe guardrail decisions without enforcing them, enabling safe, incremental rollout before activation across a codebase
- Multi-CLI support (Claude Code, Gemini, Copilot, Mistral) makes it portable across AI agent ecosystems; repository scoping via direnv isolates it from other folders
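Repository scoping with direnv, mentioned above, would look roughly like the following `.envrc` sketch. The variable names are hypothetical placeholders, not reasoning-core's documented configuration; only the direnv mechanics (`.envrc`, `direnv allow`) are standard:

```shell
# .envrc — direnv loads this only while the shell is inside this repository,
# so the guardrail is active here and invisible in every other folder.
# NOTE: the variable names below are illustrative placeholders, not documented config.
export REASONING_CORE_ENABLED=1
export REASONING_CORE_MODE=shadow         # observe decisions without enforcing them
export REASONING_CORE_PLAN="$PWD/PLAN.md" # hypothetical path to the agent's plan
```

After creating the file, a one-time `direnv allow` approves it; leaving the directory unloads the variables automatically, which is what keeps the guardrail repository-scoped.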
Editorial Opinion
reasoning-core addresses a genuine friction point in AI-assisted development: capable agents that lack organizational discipline. By running a lightweight 130M-parameter supervisor locally, the project shows that effective code governance doesn't require expensive cloud infrastructure or third-party oversight. While the 8.2% average token savings are meaningful rather than transformative, the bigger wins are in plan adherence and code consistency, problems that cause real drag in team workflows. The shadow-mode design is particularly smart, letting teams evaluate guardrail behavior on their own codebase before enforcement. This pragmatic, privacy-first approach to agent alignment deserves attention as a model for how developer tooling can remain lightweight and locally controllable even at scale.
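The observe-before-enforce pattern praised here can be sketched in a few lines of shell. This is a generic illustration of the idea, not reasoning-core's actual interface; the allowed scope, the mode variable, and the decision function are all assumptions:

```shell
#!/bin/sh
# Hypothetical shadow-mode guardrail sketch. In shadow mode an out-of-scope edit
# is logged as "would-block"; in enforce mode the same check fails hard.
ALLOWED_SCOPE="src/"               # assumed plan scope for this illustration
MODE="${GUARDRAIL_MODE:-shadow}"   # "shadow" logs only; "enforce" blocks

check_edit() {
  path="$1"
  case "$path" in
    "$ALLOWED_SCOPE"*) echo "ALLOW $path" ;;
    *)
      if [ "$MODE" = "enforce" ]; then
        echo "BLOCK $path"
        return 1
      else
        echo "SHADOW would-block $path"   # observed, never enforced
      fi
      ;;
  esac
}

check_edit "src/app.py"        # in scope: allowed either way
check_edit "infra/deploy.sh"   # out of scope: logged in shadow, blocked in enforce
```

Flipping `GUARDRAIL_MODE=enforce` turns the logged would-block lines into hard failures, which is exactly the rollout sequence shadow mode enables: observe decisions on your own codebase first, then switch on enforcement.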