BotBeat
...
← Back

> ▌

AnthropicAnthropic
PRODUCT LAUNCHAnthropic2026-03-19

llamafile 0.10.0 Released: Rebuilt Framework Now Supports Qwen3.5, Tool Calling, and Anthropic API

Key Takeaways

  • ▸llamafile 0.10.0 fully rebuit to maintain portability while supporting latest llama.cpp models and features
  • ▸Now supports Qwen3.5 multimodal, lfm2 tool calling, and Anthropic Messages API for local Claude-compatible inference
  • ▸Maintains cross-platform APE executable format running on multiple OSes and CPU architectures with CUDA support
Source:
Hacker Newshttps://blog.mozilla.ai/llamafile-reloaded-whats-new-in-v0-10-0/↗

Summary

llamafile 0.10.0 has been released with a complete rebuild that maintains the project's core mission of portable, executable-bundled AI models while incorporating the latest features from llama.cpp. The new version supports advanced capabilities including Qwen3.5 vision models, lfm2 tool calling, and integration with Anthropic's Messages API, allowing users to run Claude-compatible models locally from a single executable file.

The rebuild architecture uses a polyglot approach combining llama.cpp's extensive model support with llamafile's signature portability across different operating systems and CPU architectures. Users can now run multimodal models in terminal chat interfaces, leverage CUDA GPU acceleration on Linux, and benefit from CPU optimizations for various architectures. The project maintains backward compatibility by preserving older releases and model weights on HuggingFace.

The team plans to continue improving feature parity with older versions, simplify the bundling process through a forthcoming llamafile-builder application, and add Vulkan support for broader GPU compatibility. Pre-built llamafiles covering a range of model sizes (0.6B to 27B parameters) and capabilities are available, with options to load custom GGUF model files directly.

  • Future roadmap includes easier bundling via llamafile-builder, Vulkan GPU support, and full feature parity

Editorial Opinion

llamafile 0.10.0 represents a pragmatic evolution of the project, successfully balancing portability with feature completeness by adopting llama.cpp as its foundation. The integration of Anthropic's Messages API and support for cutting-edge models like Qwen3.5 positions llamafile as a serious tool for local AI deployment, though the emphasis on community feedback and feature parity suggests the team is still iterating toward the ideal user experience. The planned llamafile-builder application could be transformative if it lowers barriers to creating custom bundled executables.

Large Language Models (LLMs)Multimodal AIAI AgentsOpen Source

More from Anthropic

AnthropicAnthropic
PARTNERSHIP

Anthropic Expands Partnership with SpaceX, Scales GB200 Capacity in Colossus 2

2026-05-20
AnthropicAnthropic
POLICY & REGULATION

Advanced AI Models Bring Government to 'Reflection Point,' CIA Official Says

2026-05-20
AnthropicAnthropic
RESEARCH

Anthropic Claude Code Sandbox Bypass: Second Vulnerability Exposes Critical Data Exfiltration Risk

2026-05-20

Comments

Suggested

Research CommunityResearch Community
RESEARCH

New Methodology Proposed for Selecting Runtime Architecture Patterns in Production LLM Agents

2026-05-20
Google / AlphabetGoogle / Alphabet
PRODUCT LAUNCH

Google DeepMind Launches Gemini 3.5 Flash: New Lightweight AI Model

2026-05-20
Executive Office of the President of the United States (Policy/Regulation)Executive Office of the President of the United States (Policy/Regulation)
RESEARCH

SID Achieves Search Breakthrough with SID-1, Outperforming GPT-5 at 1k+ QPS Using Reinforcement Learning

2026-05-20
← Back to news
© 2026 BotBeat
AboutPrivacy PolicyTerms of ServiceContact Us