BotBeat

PRODUCT LAUNCH · Cloudflare · 2026-04-02

Cloudflare's Workers AI Enters Large Model Inference Market With Moonshot AI's Kimi K2.5

Key Takeaways

  • Workers AI now supports frontier-scale open-source models, starting with Kimi K2.5, enabling end-to-end agent deployment on Cloudflare's platform
  • Cloudflare reported 77% cost savings versus mid-tier proprietary models on an internal security review agent, a result it presents as evidence of a price-performance advantage
  • The shift reflects market demand for cost-effective large model inference as inference volume climbs with the widespread adoption of personal and autonomous agents
Source: Hacker News, https://blog.cloudflare.com/workers-ai-large-models/

Summary

Cloudflare has announced that its Workers AI platform now supports large-scale language models, starting with Moonshot AI's Kimi K2.5. The move marks a significant expansion of Cloudflare's AI inference capabilities, enabling developers to run complete agentic workflows on a single unified platform combining Durable Objects, Workflows, and Workers infrastructure with frontier-class models.

Kimi K2.5 features a 256k context window and supports multi-turn tool calling, vision inputs, and structured outputs, all capabilities critical for agentic applications. Cloudflare has already deployed the model internally for code review automation and security analysis. On a security review agent processing 7 billion tokens daily, the company reported a 77% cost reduction compared to mid-tier proprietary models.
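To make the reported figures concrete, the following is a minimal arithmetic sketch, not Cloudflare's own calculation. The token volume and the 77% reduction come from the article; the baseline price per million tokens is a hypothetical placeholder, since the article does not state any prices, and the flat blended rate is a simplifying assumption.

```python
def daily_cost(tokens_per_day: int, usd_per_million_tokens: float) -> float:
    """Cost in USD of processing tokens_per_day at a flat blended rate."""
    return tokens_per_day / 1_000_000 * usd_per_million_tokens


TOKENS_PER_DAY = 7_000_000_000  # from the article: 7 billion tokens daily
REPORTED_REDUCTION = 0.77       # from the article: 77% cost reduction
BASELINE_USD_PER_M = 3.00       # HYPOTHETICAL blended price, not from the article

baseline = daily_cost(TOKENS_PER_DAY, BASELINE_USD_PER_M)
reduced = baseline * (1 - REPORTED_REDUCTION)
savings = baseline - reduced

print(f"baseline:            ${baseline:,.0f}/day")
print(f"after 77% reduction: ${reduced:,.0f}/day")
print(f"savings:             ${savings:,.0f}/day")
```

Swapping in a real blended price for `BASELINE_USD_PER_M` scales all three figures linearly; the 77% ratio itself does not depend on the placeholder.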

The expansion reflects a broader industry shift toward open-source models as inference volume explodes with the proliferation of personal and autonomous agents. Cloudflare positions Workers AI as a cost-effective alternative to proprietary models for enterprises scaling agent deployments, addressing what it sees as the primary blocker to widespread AI adoption: pricing and operational costs.

  • Kimi K2.5's 256k context window and agentic capabilities make it well-suited for complex, multi-turn reasoning tasks in code review and security analysis

Editorial Opinion

Cloudflare's move to support large open-source models directly addresses a critical pain point in agent economics. As inference costs become the primary blocker to scaling, enterprises will increasingly migrate from proprietary to open-source alternatives. The 77% cost reduction on real internal workloads is striking and suggests that frontier open-source models have closed much of the capability gap that previously justified premium pricing. This democratization of large model access through competitive infrastructure providers could fundamentally reshape AI deployment economics.

Large Language Models (LLMs) · Generative AI · AI Agents · MLOps & Infrastructure · Product Launch
