Amazon Closes AI Ranking System After Employees Game Metrics with 'Tokenmaxxing'
Key Takeaways
- ▸Amazon discontinued Kirorank after discovering employees were gaming the system by artificially inflating AI token usage to climb rankings
- ▸The practice of 'tokenmaxxing'—running unnecessary AI agents—wasted expensive computational resources at a company investing $200 billion in infrastructure annually
- ▸Amazon is shifting to productivity-focused metrics like 'normalized deployments' to measure actual code quality and usefulness rather than raw token consumption
Summary
Amazon has discontinued Kirorank, an internal ranking system that measured how heavily employees used AI on the company's 'Kiro' development platform. The metric was created to encourage AI adoption, but it backfired when employees began "tokenmaxxing": artificially inflating their usage by running unnecessary AI agents to climb the rankings without producing meaningful work.
The gaming exposed a costly problem: thousands of employees running wasteful AI tasks to boost their scores were burning through expensive computational resources. Every inference call costs money, and with Amazon investing billions annually in AI infrastructure, the misaligned incentive became untenable. Dave Treadwell, a VP at Amazon, acknowledged that the ranking had produced perverse incentives, stating "Don't use AI just to use AI."
Amazon is pivoting to outcome-focused metrics, such as "normalized deployments," which measure whether AI actually helps generate useful code rather than simply tracking token consumption. The company remains committed to AI adoption and still aims for 80% of developers to use the technology weekly, but it wants to reach that goal through better-aligned incentives.
- ▸The incident demonstrates the dangers of poorly designed incentive systems in AI adoption and highlights the importance of measuring outcomes, not activity



