BotBeat

Cloudflare
PRODUCT LAUNCH · 2026-04-02

Cloudflare's Workers AI Enters Large Model Inference Market With Moonshot AI's Kimi K2.5

Key Takeaways

  • Workers AI now supports frontier-scale open-source models, starting with Kimi K2.5, enabling end-to-end agent deployment on Cloudflare's platform
  • Cloudflare reports 77% cost savings versus mid-tier proprietary models on an internal security review agent, a compelling price-performance advantage
  • The shift reflects market demand for cost-effective large-model inference as inference volume grows with the spread of personal and autonomous agents
Source: Hacker News (https://blog.cloudflare.com/workers-ai-large-models/)

Summary

Cloudflare has announced that its Workers AI platform now supports large-scale language models, starting with Moonshot AI's Kimi K2.5. The move marks a significant expansion of Cloudflare's AI inference capabilities, enabling developers to run complete agentic workflows on a single unified platform combining Durable Objects, Workflows, and Workers infrastructure with frontier-class models.

Kimi K2.5 features a 256k context window and supports multi-turn tool calling, vision inputs, and structured outputs—capabilities critical for agentic applications. Cloudflare has already deployed the model internally for code review automation and security analysis, achieving substantial cost savings. The company reported a 77% cost reduction compared to mid-tier proprietary models on a security review agent processing 7 billion tokens daily.
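
The scale of those savings is easy to sanity-check. The 77% reduction and the 7-billion-tokens-per-day volume come from the announcement; the per-million-token rate below is a purely illustrative placeholder, not a real price from either vendor.

```python
# Back-of-the-envelope sketch of the reported savings. Only the 77% reduction
# and the 7B-tokens/day volume are from the article; the rate is hypothetical.

TOKENS_PER_DAY = 7_000_000_000   # security review agent volume (from the article)
PROPRIETARY_RATE = 2.50          # hypothetical $/1M tokens for a mid-tier proprietary model
REDUCTION = 0.77                 # reported cost reduction

def daily_cost(tokens: int, rate_per_million: float) -> float:
    """Dollars per day at a flat per-million-token rate."""
    return tokens / 1_000_000 * rate_per_million

proprietary = daily_cost(TOKENS_PER_DAY, PROPRIETARY_RATE)
open_model = proprietary * (1 - REDUCTION)

print(f"proprietary: ${proprietary:,.0f}/day")
print(f"open model:  ${open_model:,.0f}/day")
print(f"saved:       ${proprietary - open_model:,.0f}/day")
```

Whatever the real rates are, at billions of tokens per day a 77% reduction compounds into a figure large enough to drive migration decisions on its own.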

The expansion reflects a broader industry shift toward open-source models as inference volume explodes with the proliferation of personal and autonomous agents. Cloudflare positions Workers AI as a cost-effective alternative to proprietary models for enterprises scaling agent deployments, addressing what it sees as the primary blocker to widespread AI adoption: pricing and operational costs.

  • Kimi K2.5's 256k context window and agentic capabilities make it well-suited for complex, multi-turn reasoning tasks in code review and security analysis
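
A request exercising those agentic capabilities might look like the sketch below. The endpoint shape is Workers AI's standard REST route; the model slug, the `fetch_diff` tool, and the PR number are all hypothetical illustrations, not identifiers from the announcement.

```python
# Sketch of a multi-turn tool-calling request against the Workers AI REST API.
# The model slug "@cf/moonshotai/kimi-k2.5" is a guess -- check Cloudflare's
# model catalog for the real identifier. The tool is a hypothetical example.

import json

def build_payload(history: list, tools: list) -> dict:
    """Assemble a chat payload with tool definitions for one agentic turn."""
    return {"messages": history, "tools": tools}

tools = [{
    "type": "function",
    "function": {
        "name": "fetch_diff",  # hypothetical helper for a code-review agent
        "description": "Fetch the unified diff for a pull request.",
        "parameters": {
            "type": "object",
            "properties": {"pr_number": {"type": "integer"}},
            "required": ["pr_number"],
        },
    },
}]

history = [
    {"role": "system", "content": "You are a security review agent."},
    {"role": "user", "content": "Review PR 1234 for injection risks."},
]

payload = build_payload(history, tools)

# The actual call (requires a real account ID, API token, and model slug):
#   POST https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/moonshotai/kimi-k2.5
#   Authorization: Bearer {API_TOKEN}
#   body: json.dumps(payload)
print(json.dumps(payload)[:60])
```

On each turn the agent appends the model's tool-call message and the tool's result to `history` and re-sends, which is where a 256k context window earns its keep.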

Editorial Opinion

Cloudflare's move to support large open-source models directly addresses a critical pain point in agent economics—as inference costs become the primary blocker to scaling, enterprises will increasingly migrate from proprietary to open-source alternatives. The 77% cost reduction on real internal workloads is striking and suggests that frontier open-source models have finally closed the capability gap that previously justified premium pricing. This democratization of large model access through competitive infrastructure providers could fundamentally reshape AI deployment economics.

Large Language Models (LLMs) · Generative AI · AI Agents · MLOps & Infrastructure · Product Launch

More from Cloudflare

Cloudflare
RESEARCH

Cloudflare Rethinking Cache Architecture for AI-Driven Traffic Era

2026-04-02
Cloudflare
PRODUCT LAUNCH

Cloudflare Slashes AI Agent Token Costs by 98% With RFC 9457-Compliant Error Responses

2026-03-27
Cloudflare
PRODUCT LAUNCH

Cloudflare Launches Dynamic Workers to Run AI Agent Code 100x Faster Without Containers

2026-03-26


Suggested

Anthropic
RESEARCH

Inside Claude Code's Dynamic System Prompt Architecture: Anthropic's Complex Context Engineering Revealed

2026-04-05
Oracle
POLICY & REGULATION

AI Agents Promise to 'Run the Business'—But Who's Liable When Things Go Wrong?

2026-04-05
Google / Alphabet
RESEARCH

Deep Dive: Optimizing Sharded Matrix Multiplication on TPU with Pallas

2026-04-05
© 2026 BotBeat