BotBeat

PRODUCT LAUNCH · Anthropic · 2026-04-01

Tamp Token Compression Proxy Cuts API Costs 50–60% for Coding Agents

Key Takeaways

  • Tamp reduces input token costs by 52.6% for coding agents with zero code changes required
  • Lightweight proxy (70MB RAM, <5ms latency) works with Claude Code, Aider, Cursor, Cline, Windsurf, and other OpenAI-compatible agents
  • Eight default compression stages intelligently process JSON, code, arrays, and other tool outputs before API submission
Source: Hacker News (https://github.com/sliday/tamp)

Summary

A new open-source token compression proxy called Tamp has been released, enabling developers to reduce input token costs by 52.6% when using AI coding agents with zero code changes. The tool works as a middleware layer between popular coding agents—including Claude Code, Aider, Cursor, Cline, and Windsurf—and API endpoints from Anthropic, OpenAI, and Google, automatically compressing eligible tool outputs like JSON, arrays, and code before forwarding requests upstream.
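The "zero code changes" claim rests on the agent's configurable API base URL: instead of calling the provider directly, the agent is pointed at the local proxy. A minimal setup sketch, assuming Tamp listens on localhost; the port shown is illustrative, so check the Tamp README for the actual value:

```shell
# Point an Anthropic-compatible agent (e.g. Claude Code) at the local proxy.
# The proxy compresses eligible tool outputs, then forwards the request upstream.
export ANTHROPIC_BASE_URL="http://localhost:8080"   # hypothetical proxy address
```

The same redirection pattern applies to OpenAI- and Gemini-compatible agents via their respective base-URL settings.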

Tamp employs eight default compression stages including JSON minification, columnar TOON encoding for arrays, line-number stripping, whitespace collapsing, and deduplication of repeated outputs. The proxy is lightweight, requiring only 70MB of RAM with sub-5ms latency, and runs entirely in Node.js without Python dependencies. Installation is straightforward via npm, with a Claude Code plugin available for automatic integration and status monitoring.
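The lossless stages above can be sketched as small pure functions chained in sequence. The names and heuristics below are illustrative, not Tamp's actual implementation:

```typescript
// Stage: JSON minification — re-serialize without indentation or spacing.
function minifyJson(text: string): string {
  try {
    return JSON.stringify(JSON.parse(text));
  } catch {
    return text; // not JSON — pass through unchanged
  }
}

// Stage: line-number stripping — drop leading "12: " style prefixes
// that some tools prepend to file contents.
function stripLineNumbers(text: string): string {
  return text.replace(/^\s*\d+[:→|]\s?/gm, "");
}

// Stage: whitespace collapse — trim trailing spaces, squeeze blank-line runs.
function collapseWhitespace(text: string): string {
  return text
    .split("\n")
    .map((line) => line.replace(/\s+$/, ""))
    .join("\n")
    .replace(/\n{3,}/g, "\n\n");
}

// Run the stages in order, keeping each result only if it actually shrinks the text.
function compress(text: string): string {
  const stages = [minifyJson, stripLineNumbers, collapseWhitespace];
  return stages.reduce((acc, stage) => {
    const out = stage(acc);
    return out.length < acc.length ? out : acc;
  }, text);
}
```

Because every stage is a pure string-to-string function, a pipeline like this can guarantee losslessness by simply discarding any stage's output that fails to shrink the input.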

The tool supports multiple API formats—Anthropic Messages, OpenAI Chat Completions, and Google Gemini—making it compatible with a wide ecosystem of coding agents. Advanced users can enable optional lossy compression stages via LLMLingua-2 or Ollama/OpenRouter for additional token savings, with full configuration support via environment variables or a persistent config file.
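Supporting several wire formats means the proxy must first recognize which API a request targets before it can locate tool outputs to compress. A hedged sketch of path-based detection, using the providers' public endpoint paths (the function itself is hypothetical, not Tamp's code):

```typescript
type ApiFormat = "anthropic" | "openai" | "gemini" | "unknown";

// Classify a request by its URL path, using each provider's public endpoint shape:
// Anthropic Messages, OpenAI Chat Completions, and Gemini generateContent.
function detectFormat(path: string): ApiFormat {
  if (path.startsWith("/v1/messages")) return "anthropic";
  if (path.startsWith("/v1/chat/completions")) return "openai";
  if (/\/v1(beta)?\/models\/.+:generateContent/.test(path)) return "gemini";
  return "unknown";
}
```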

  • Easy installation via npm with Claude Code plugin support for auto-configuration and status monitoring
  • Optional lossy compression modes available for additional token savings via LLMLingua-2 or Ollama
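Deduplication of repeated outputs, one of the default stages listed in the summary, can be sketched as hash-based replacement: identical payloads after the first are swapped for a short reference. Names and the placeholder text are hypothetical:

```typescript
import { createHash } from "node:crypto";

// Replace repeated tool outputs with a short back-reference to the first copy.
function dedupeOutputs(outputs: string[]): string[] {
  const seen = new Map<string, number>(); // content hash -> index of first occurrence
  return outputs.map((out, i) => {
    const key = createHash("sha256").update(out).digest("hex");
    const first = seen.get(key);
    if (first !== undefined) {
      return `[duplicate of tool output #${first + 1} omitted]`;
    }
    seen.set(key, i);
    return out;
  });
}
```

Coding agents often re-read the same file or re-run the same command within a session, which is why a stage like this can yield outsized savings on long transcripts.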

Editorial Opinion

Tamp represents a practical solution to a real cost challenge in the AI agent ecosystem—reducing unnecessary token overhead without requiring developers to refactor their code or change their workflows. The multi-stage compression approach is thoughtfully designed to handle the diverse output types that coding agents produce, from JSON to source code, balancing lossless and optional lossy compression. If the claimed 50–60% savings hold up in real-world usage, this could meaningfully improve economics for teams running coding agents at scale, though the long-term value proposition hinges on whether LLMs themselves eventually optimize for such redundancies natively.

Generative AI · AI Agents · MLOps & Infrastructure · Open Source

