BotBeat

Anthropic | PRODUCT LAUNCH | 2026-04-01

Tamp Token Compression Proxy Cuts API Costs 50–60% for Coding Agents

Key Takeaways

  • Tamp reduces input token costs by 52.6% for coding agents with zero code changes required
  • Lightweight proxy (70MB RAM, <5ms latency) works with Claude Code, Aider, Cursor, Cline, Windsurf, and other OpenAI-compatible agents
  • Eight default compression stages intelligently process JSON, code, arrays, and other tool outputs before API submission
Source: Hacker News (https://github.com/sliday/tamp)

Summary

Tamp, a newly released open-source token compression proxy, lets developers cut input token costs for AI coding agents by a reported 52.6% without changing any code. It sits as a middleware layer between popular coding agents (including Claude Code, Aider, Cursor, Cline, and Windsurf) and API endpoints from Anthropic, OpenAI, and Google, automatically compressing eligible tool outputs such as JSON, arrays, and code before forwarding requests upstream.
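To make the middleware idea concrete, here is a minimal sketch of what compressing tool outputs in a request body could look like. It assumes an Anthropic Messages-style payload in which tool output arrives as `tool_result` content blocks; the function names `minify_if_json` and `compress_request` are hypothetical and are not taken from Tamp's source, and only one lossless step (JSON minification) is shown.

```python
import json


def minify_if_json(text: str) -> str:
    """Return a compact re-serialization if text is valid JSON, else text unchanged."""
    try:
        parsed = json.loads(text)
    except ValueError:
        return text  # not JSON: leave the tool output exactly as it was
    return json.dumps(parsed, separators=(",", ":"), ensure_ascii=False)


def compress_request(body: dict) -> dict:
    """Minify JSON inside tool_result blocks (in place); leave everything else untouched."""
    for message in body.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content is ordinary prose, not tool output
        for block in content:
            if not isinstance(block, dict) or block.get("type") != "tool_result":
                continue
            inner = block.get("content")
            if isinstance(inner, str):
                block["content"] = minify_if_json(inner)
            elif isinstance(inner, list):
                # tool_result content may also be a list of typed parts
                for part in inner:
                    if isinstance(part, dict) and part.get("type") == "text":
                        part["text"] = minify_if_json(part["text"])
    return body
```

A real proxy would then forward the modified body upstream and stream the response back unchanged; that networking layer is omitted here.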

Tamp employs eight default compression stages including JSON minification, columnar TOON encoding for arrays, line-number stripping, whitespace collapsing, and deduplication of repeated outputs. The proxy is lightweight, requiring only 70MB of RAM with sub-5ms latency, and runs entirely in Node.js without Python dependencies. Installation is straightforward via npm, with a Claude Code plugin available for automatic integration and status monitoring.
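Several of the stages named above are simple text transforms. The sketch below illustrates four of them under stated assumptions: line numbers are assumed to use a `cat -n`-style "digits then tab" prefix, the columnar layout is a simplified header-plus-rows format in the spirit of TOON rather than the actual TOON specification, and all function names are hypothetical rather than Tamp's own.

```python
import json
import re

# Assumed prefix format: optional spaces, digits, then a tab (as in `cat -n`).
_LINE_NO = re.compile(r"^[ ]*\d+\t", re.MULTILINE)


def strip_line_numbers(text: str) -> str:
    """Remove a leading 'digits + tab' prefix from every line."""
    return _LINE_NO.sub("", text)


def collapse_whitespace(text: str) -> str:
    """Drop trailing spaces/tabs and squeeze runs of blank lines down to one."""
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    return re.sub(r"\n{3,}", "\n\n", text)


def to_columnar(rows):
    """Encode a list of flat dicts with identical keys as one header plus value rows.

    Returns None when the input does not qualify, so the caller can fall back
    to ordinary JSON minification.
    """
    if not rows or not all(isinstance(r, dict) for r in rows):
        return None
    keys = list(rows[0])
    for r in rows:
        if list(r) != keys or any(isinstance(v, (dict, list)) for v in r.values()):
            return None
    header = f"[{len(rows)}]{{{','.join(keys)}}}:"
    lines = [",".join(json.dumps(r[k]) for k in keys) for r in rows]
    return "\n".join([header] + lines)


def dedupe(outputs):
    """Replace a repeated output with a short back-reference when that is shorter."""
    first_seen = {}
    result = []
    for index, text in enumerate(outputs):
        placeholder = f"[same as output {first_seen[text]}]" if text in first_seen else None
        if placeholder is not None and len(placeholder) < len(text):
            result.append(placeholder)
        else:
            first_seen.setdefault(text, index)
            result.append(text)
    return result
```

The columnar form saves tokens because the keys appear once in the header instead of once per array element; leading indentation is deliberately preserved by `collapse_whitespace` since it is significant in source code.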

The tool supports multiple API formats—Anthropic Messages, OpenAI Chat Completions, and Google Gemini—making it compatible with a wide ecosystem of coding agents. Advanced users can enable optional lossy compression stages via LLMLingua-2 or Ollama/OpenRouter for additional token savings, with full configuration support via environment variables or a persistent config file.
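Supporting three API formats implies the proxy must tell them apart before it knows where tool outputs live in the payload. One plausible approach, sketched here as an assumption since the article does not describe Tamp's routing, is to inspect the request path: the public endpoints are `/v1/messages` for Anthropic Messages, `/v1/chat/completions` for OpenAI Chat Completions, and `...:generateContent` for Gemini. The function name `detect_api_format` and the returned labels are hypothetical.

```python
from typing import Optional


def detect_api_format(path: str) -> Optional[str]:
    """Guess the upstream API format from the request path, or None if unknown."""
    path = path.split("?", 1)[0].rstrip("/")  # ignore query string and trailing slash
    if path.endswith("/v1/messages"):
        return "anthropic"
    if path.endswith("/chat/completions"):
        return "openai"
    if path.endswith(":generateContent") or path.endswith(":streamGenerateContent"):
        return "gemini"
    return None  # unknown routes would be passed through uncompressed
```

Returning `None` for unrecognized paths keeps the sketch conservative: anything the proxy cannot classify is forwarded without modification.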

  • Easy installation via npm with Claude Code plugin support for auto-configuration and status monitoring
  • Optional lossy compression modes available for additional token savings via LLMLingua-2 or Ollama

Editorial Opinion

Tamp represents a practical solution to a real cost challenge in the AI agent ecosystem—reducing unnecessary token overhead without requiring developers to refactor their code or change their workflows. The multi-stage compression approach is thoughtfully designed to handle the diverse output types that coding agents produce, from JSON to source code, balancing lossless and optional lossy compression. If the claimed 50–60% savings hold up in real-world usage, this could meaningfully improve economics for teams running coding agents at scale, though the long-term value proposition hinges on whether LLMs themselves eventually optimize for such redundancies natively.

Generative AI, AI Agents, MLOps & Infrastructure, Open Source
