Gemini's Cache Feature Bug Causes $1,000 Per Hour Billing Overcharges
Key Takeaways
- ▸A critical bug in Gemini's cache feature is overcharging users at rates of approximately $1,000 per hour
- ▸The billing miscalculation appears to affect how cached API operations are priced, undermining a cost-saving feature
- ▸This incident underscores the risks of billing system complexity in AI services and the importance of thorough production monitoring
Source:
Loading tweet...
Summary
Google's Gemini AI model is experiencing a significant billing bug related to its cache feature, charging users approximately $1,000 per hour due to improper cost calculations. The issue, reported by jrflowers, appears to stem from how the caching system applies billing rates, resulting in unexpectedly high charges for what should be discounted cached operations. Cache features are designed to reduce inference costs by reusing previously computed context, but the bug is having the opposite effect. This incident highlights the operational challenges AI companies face when scaling complex, multi-component billing systems for high-volume API services.


