BotBeat

PRODUCT LAUNCH · JetBrains · 2026-06-02

JetBrains Releases Mellum2: Efficient 12B Mixture-of-Experts Model for Production AI Systems

Key Takeaways

  • Mellum2 activates only 2.5B of its 12B parameters per token, delivering 2x faster inference than comparable models while maintaining competitive performance
  • Designed as a specialized 'focal' model for routing, RAG, summarization, and agent subtasks within larger AI systems rather than as a general-purpose replacement
  • Open-source release (Apache 2.0) with weights on Hugging Face enables private, self-hosted deployment for organizations handling sensitive code and data
Source: Hacker News, https://huggingface.co/blog/JetBrains/mellum2-launch

Summary

JetBrains has released Mellum2, a 12-billion-parameter Mixture-of-Experts model trained on natural language and code and optimized for efficient, low-latency inference in production AI systems. The model activates only 2.5B parameters per token, enabling more than 2x faster inference than similarly sized models while maintaining competitive benchmark performance across code generation, reasoning, science, and math tasks.
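
To make the "active parameters" idea above concrete, here is a minimal toy sketch of top-k expert routing in plain Python. Only the 12B and 2.5B figures come from the article. The number of experts, the value of k, and the helper names (`route_top_k`, `moe_layer`) are assumptions for illustration and do not describe Mellum2's actual architecture.

```python
import math

# Figures reported in the article (billions of parameters).
TOTAL_PARAMS_B = 12.0
ACTIVE_PARAMS_B = 2.5


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for a single token."""
    return active_b / total_b


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts (hypothetical router).

    Returns (expert_index, weight) pairs, with weights renormalized
    so they sum to 1 across the selected experts.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(x, experts, router_logits, k):
    """Run only the selected experts and combine their outputs.

    Experts that are not selected are never called, which is why
    per-token compute tracks active rather than total parameters.
    """
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


if __name__ == "__main__":
    # Assumption: 4 toy experts with k=2; the article does not give these values.
    toy_experts = [lambda x, s=s: x * s for s in (1.0, 2.0, 3.0, 4.0)]
    print(active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B))
    print(moe_layer(1.0, toy_experts, [0.1, 2.0, -1.0, 1.5], k=2))
```

With the article's figures, roughly a fifth of the parameters are active per token. The sketch does not model the reported 2x speedup, which depends on hardware, batching, and implementation details the article does not cover.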

Mellum2 is designed as a "focal" model—a fast, specialized component optimized for high-frequency operations within larger AI systems rather than as a general-purpose replacement. The company positions it for use cases like routing and orchestration, retrieval-augmented generation (RAG), summarization, agent subtasks, and code-aware features. JetBrains emphasizes that modern production AI systems increasingly rely on multiple specialized models, and Mellum2 targets latency-sensitive operations that don't require frontier-scale models.
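
The multi-model pattern described above can be sketched as a small dispatcher that sends high-frequency task types to a fast model and everything else to a larger one. This is an illustrative sketch only: the task labels, the `Request` type, and the "focal" and "frontier" backend names are hypothetical and are not part of any JetBrains API.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Task kinds loosely based on the use cases listed in the article.
# The exact label strings are assumptions made for this sketch.
FOCAL_TASKS = {"routing", "rag", "summarization", "agent_subtask", "code_feature"}


@dataclass(frozen=True)
class Request:
    kind: str    # hypothetical task label
    prompt: str  # text to send to the chosen model


def pick_model(request: Request) -> str:
    """Return the backend label for a request: 'focal' or 'frontier'."""
    return "focal" if request.kind in FOCAL_TASKS else "frontier"


def dispatch(request: Request, backends: Dict[str, Callable[[str], str]]) -> str:
    """Send the prompt to whichever backend pick_model selects.

    `backends` maps a label to any callable that takes a prompt and
    returns text, for example a client for a self-hosted model.
    """
    return backends[pick_model(request)](request.prompt)


if __name__ == "__main__":
    # Stand-in backends so the sketch runs without any model server.
    backends = {
        "focal": lambda p: f"[fast model] {p}",
        "frontier": lambda p: f"[large model] {p}",
    }
    print(dispatch(Request("summarization", "Summarize this diff."), backends))
    print(dispatch(Request("open_ended_reasoning", "Plan a migration."), backends))
```

A real system would likely route on more than a static label, for instance on input length or a latency budget. The static lookup is used here only to keep the example short.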

Released under the Apache 2.0 license with weights available on Hugging Face, Mellum2 can be deployed in self-hosted environments, making it suitable for organizations with proprietary code or privacy requirements. The release includes a full technical report detailing architecture, training methodology, and comprehensive benchmarks, underscoring JetBrains' commitment to open-source AI infrastructure.

  • Specialized for text and code workloads, reflecting JetBrains' focus on software engineering use cases

Editorial Opinion

Mellum2 represents a thoughtful departure from the race toward larger, more general-purpose models. By releasing an efficient, specialized model optimized for specific high-frequency tasks, JetBrains acknowledges a practical reality: production AI systems don't need a frontier model doing every job. This 'focal model' approach—pairing fast, task-specific models with larger reasoning models—is likely to become increasingly valuable as organizations seek to balance cost, latency, and capability. JetBrains' open-source licensing also lowers barriers to adoption for teams building internal AI infrastructure.

Large Language Models (LLMs) · Generative AI · Product Launch · Open Source

