BotBeat
...
← Back

> ▌

agiwhitelistagiwhitelist
PRODUCT LAUNCHagiwhitelist2026-06-16

Tokdiet: New LLM Proxy Cuts API Costs 71% While Maintaining Quality Parity

Key Takeaways

  • ▸71% token reduction (5.07M → 1.46M) with 95-97% quality parity proven through A/B testing on 66 real tasks across two models
  • ▸Cache-aware design preserves prompt caching optimization benefits, avoiding the cache invalidation problem that affects naive context optimization
  • ▸Available as Claude Code plugin with simple setup (npx tokdiet start) and transparent proxy routing without requiring code changes
Source:
Hacker Newshttps://github.com/agiwhitelist/tokdiet↗

Summary

agiwhitelist has launched Tokdiet, a local proxy that optimizes context for LLM API requests and reduces token consumption by approximately 71% without sacrificing output quality. The tool sits between AI agents and model APIs, intelligently compacting context while preserving information relevance. The creator demonstrated this through rigorous A/B testing: across 66 real-world tasks on MiniMax models, Tokdiet reduced input tokens from 5.07M to 1.46M while maintaining 95-97% quality parity compared to baseline full-context runs.

Unlike existing context optimization tools that achieve cost savings through blind pruning, Tokdiet is designed with awareness of modern LLM optimizations like prompt caching, ensuring it doesn't invalidate existing cache benefits. The tool also implements a fail-open architecture, reverting to transparent passthrough if any internal error occurs, guaranteeing it will never break production requests. Tokdiet is available as a Claude Code plugin and can be deployed locally via npx, supporting Claude, OpenAI, and other LLM providers.

  • Fail-open safety architecture ensures the proxy never breaks production requests, automatically falling back to passthrough on error
  • Security-first design keeps API keys local and never logs credentials to disk

Editorial Opinion

Tokdiet addresses a real and growing pain point in LLM application development—token costs—with an unusual level of transparency about the cost-quality tradeoff. Most token optimization tools either hide their impact on output quality or achieve savings through blind pruning that risks degrading model performance. By publicly releasing A/B test results showing 71% cost reduction with measurable quality parity, Tokdiet sets a new standard for responsible cost optimization. For developers running Claude, OpenAI, or other models at scale, this cache-aware approach could meaningfully impact infrastructure costs without the typical quality penalties.

Large Language Models (LLMs)Generative AIMachine LearningMLOps & InfrastructureProduct LaunchOpen Source

Comments

Suggested

Wolfram ResearchWolfram Research
PRODUCT LAUNCH

Wolfram Language 15 Launches With Embedded AI, Deepening Integration With Large Language Models

2026-06-16
DeepSeekDeepSeek
OPEN SOURCE

cwcode: Open-Source Terminal Coding Agent Optimized for DeepSeek V4 and Local LLMs

2026-06-16
TransfigureTransfigure
PRODUCT LAUNCH

Transfigure Launches αLPHA: AI Converts 2D Images to 3D CAD Files

2026-06-16
← Back to news
© 2026 BotBeat
AboutPrivacy PolicyTerms of ServiceContact Us