Vercel's AI Gateway Production Index Shows Anthropic Leads in Spend, Google in Volume
Key Takeaways
- Anthropic leads in spending, accounting for 61% of April AI Gateway spend despite higher unit costs, reflecting its strength in high-stakes, reasoning-intensive workloads
- Google's Gemini Flash captured the largest share of token volume (38%) on significantly lower per-token pricing, showing a market split between premium reasoning and cost-efficient inference
- OpenAI's spending share tripled from March to April following recent model releases, indicating strong adoption of updated reasoning capabilities
Summary
Vercel has released its AI Gateway Production Index, which shows how production AI workloads are distributed across models in real-world applications. Based on seven months of production traffic from over 200,000 unique teams routing tens of trillions of tokens through AI Gateway, the report describes a segmented market: Anthropic leads in overall spending, with 61% of April spend despite higher unit costs, while Google's Gemini Flash holds the largest share of token volume at 38% thanks to significantly lower per-token pricing. OpenAI's spend share has grown rapidly, tripling from March to April following its recent GPT-5.4 and GPT-5.5 releases.
The analysis reveals a clear economic segmentation where model selection is driven by the cost of errors in each use case. Premium, high-stakes applications tend to use Anthropic's Claude for reasoning-intensive tasks, while high-volume, cost-sensitive workloads route to Google's Gemini Flash and other efficient models. B2B applications spend roughly twice as much per token as B2C applications, and back-office workflows spend significantly more than consumer applications because errors carry greater financial and operational risks.
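The error-cost reasoning above can be made concrete with a small sketch. This is an illustration, not something taken from the Vercel report: the model names, prices, error rates, and the `expected_cost` formula (token cost plus error probability times the cost of an error) are all hypothetical assumptions chosen to show how the cheapest choice can flip as errors become more expensive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    price_per_million_tokens: float  # USD; hypothetical value, not from the report
    error_rate: float                # assumed probability that a response is wrong


def expected_cost(model: ModelOption, tokens: int, cost_of_error: float) -> float:
    """Token cost plus the expected cost of a wrong answer (assumed model)."""
    token_cost = model.price_per_million_tokens * tokens / 1_000_000
    return token_cost + model.error_rate * cost_of_error


def cheapest_model(models: list[ModelOption], tokens: int, cost_of_error: float) -> ModelOption:
    """Pick the option with the lowest expected cost for one request."""
    return min(models, key=lambda m: expected_cost(m, tokens, cost_of_error))


# Hypothetical options: a pricier, more accurate model and a cheaper, less accurate one.
PREMIUM = ModelOption("premium-reasoning", price_per_million_tokens=15.0, error_rate=0.01)
BUDGET = ModelOption("budget-fast", price_per_million_tokens=0.5, error_rate=0.05)

if __name__ == "__main__":
    for cost_of_error in (0.10, 50.0):  # USD lost when a response is wrong (assumed)
        choice = cheapest_model([PREMIUM, BUDGET], tokens=10_000, cost_of_error=cost_of_error)
        print(f"cost of error ${cost_of_error:.2f}: choose {choice.name}")
```

Under these assumed numbers, a 10,000-token request goes to the budget model when an error costs ten cents and to the premium model when an error costs fifty dollars, which is the same pattern the report describes for consumer versus back-office workloads.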
Agentic workloads now account for 59% of all token volume, up from approximately 30% six months ago. The report shows that no single AI provider dominates across all use cases: Anthropic leads in software building and back-office work, Google over-indexes in consumer applications, and OpenAI is more evenly distributed. Open-source models are gaining traction, suggesting a maturing market where developers select tools based on economics and performance rather than brand affinity.
- Agentic workloads now represent 59% of token volume, roughly double their share six months ago, making cost-accuracy tradeoffs central to model selection
- The market segments by use-case economics rather than absolute model quality: Anthropic for high-stakes work, Google for volume, and OpenAI balanced across categories
Editorial Opinion
Vercel's AI Gateway Production Index provides a rare view of real-world model deployment patterns that traditional benchmarks cannot capture. Rather than a winner-take-all dynamic, the data reveals a mature ecosystem where different AI companies own distinct value layers: Claude for high-stakes reasoning, Gemini Flash for cost-efficient volume, and GPT for balanced capabilities. The explosive growth of agentic workloads and the clear segmentation by error cost suggest the future of AI won't be determined by any single best model, but by intelligent routing infrastructure that selects the optimal tool for each task. This points to a new competitive frontier where success belongs to those who master multi-model orchestration and cost optimization.



