BotBeat

Cloudflare
PRODUCT LAUNCH | Cloudflare | 2026-04-02

Cloudflare's Workers AI Enters Large Model Inference Market With Moonshot AI's Kimi K2.5

Key Takeaways

  • Workers AI now supports frontier-scale open-source models, starting with Kimi K2.5, so agents can be built and deployed end to end on Cloudflare's platform
  • Cloudflare reports a 77% cost reduction versus mid-tier proprietary models on an internal security review agent
  • The shift reflects demand for cost-effective large-model inference as inference volume grows with the adoption of personal and autonomous agents
Source: Hacker News, https://blog.cloudflare.com/workers-ai-large-models/

Summary

Cloudflare has announced that its Workers AI platform now supports large-scale language models, starting with Moonshot AI's Kimi K2.5. The move marks a significant expansion of Cloudflare's AI inference capabilities, enabling developers to run complete agentic workflows on a single unified platform combining Durable Objects, Workflows, and Workers infrastructure with frontier-class models.

Kimi K2.5 features a 256k context window and supports multi-turn tool calling, vision inputs, and structured outputs, all capabilities critical for agentic applications. Cloudflare has already deployed the model internally for code review automation and security analysis. On a security review agent processing 7 billion tokens daily, the company reported a 77% cost reduction compared to mid-tier proprietary models.
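To make the reported figures concrete, the sketch below shows how a percentage cost reduction relates to daily token volume and per-token pricing. Only the 7 billion tokens per day and the 77% reduction come from the article. The two per-million-token prices are hypothetical placeholders, chosen so the arithmetic lands on 77%, and are not Cloudflare's or any vendor's actual rates; the function names are likewise illustrative.

```python
def daily_cost(tokens_per_day: int, price_per_million_usd: float) -> float:
    """Daily spend in USD for a given token volume and per-million-token price."""
    return tokens_per_day / 1_000_000 * price_per_million_usd


def savings_fraction(baseline_cost: float, alternative_cost: float) -> float:
    """Fraction of the baseline cost saved by switching to the alternative."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 1 - alternative_cost / baseline_cost


# From the article: the security review agent processes 7 billion tokens daily.
TOKENS_PER_DAY = 7_000_000_000

# Hypothetical blended prices (USD per million tokens), for illustration only.
PROPRIETARY_PRICE = 1.00
OPEN_MODEL_PRICE = 0.23

baseline = daily_cost(TOKENS_PER_DAY, PROPRIETARY_PRICE)
alternative = daily_cost(TOKENS_PER_DAY, OPEN_MODEL_PRICE)
saved = savings_fraction(baseline, alternative)

print(f"baseline:    ${baseline:,.2f} per day")
print(f"alternative: ${alternative:,.2f} per day")
print(f"savings:     {saved:.0%}")
```

Because the savings fraction depends only on the price ratio, the same percentage holds at any token volume; the volume determines how large the absolute dollar difference becomes.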

The expansion reflects a broader industry shift toward open-source models as inference volume explodes with the proliferation of personal and autonomous agents. Cloudflare positions Workers AI as a cost-effective alternative to proprietary models for enterprises scaling agent deployments, addressing what it sees as the primary blocker to widespread AI adoption: pricing and operational costs.

  • Kimi K2.5's 256k context window and agentic capabilities make it well-suited for complex, multi-turn reasoning tasks in code review and security analysis

Editorial Opinion

Cloudflare's move to support large open-source models directly addresses a critical pain point in agent economics: as inference costs become the primary blocker to scaling, enterprises will increasingly migrate from proprietary to open-source alternatives. The 77% cost reduction on real internal workloads is striking and suggests that frontier open-source models have finally closed the capability gap that previously justified premium pricing. This democratization of large model access through competitive infrastructure providers could fundamentally reshape AI deployment economics.

Large Language Models (LLMs), Generative AI, AI Agents, MLOps & Infrastructure, Product Launch
