Mistral Announces Small 4: Unified Multimodal Model Combining Reasoning, Coding, and Chat Capabilities
Key Takeaways
- Mistral Small 4 unifies reasoning, multimodal, and coding capabilities into a single open-source model with 119B parameters and configurable reasoning effort
- The model achieves 40% lower latency and 3x higher throughput than Small 3, while matching or exceeding GPT-OSS 120B performance with more concise outputs
- Native multimodality (text and image inputs) and a 256k context window enable versatile applications, from document analysis to complex reasoning
Summary
Mistral has announced Mistral Small 4, a major release that consolidates the capabilities of three specialized models—Magistral (reasoning), Pixtral (multimodal), and Devstral (coding)—into a single, versatile model. The 119B-parameter hybrid model features a Mixture of Experts architecture with 128 experts and 4 active per token, delivering 6B active parameters and a 256k context window. Released under the Apache 2.0 license, Small 4 introduces configurable reasoning effort, allowing users to toggle between fast, low-latency responses and deep, reasoning-intensive outputs depending on task requirements.
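The "128 experts, 4 active per token" design described above is standard top-k Mixture-of-Experts routing: a gating network scores every expert for each token, and only the k highest-scoring experts run, which is why the active parameter count (6B) is far below the total (119B). The sketch below illustrates the routing step only, with random stand-in gate scores; it is an illustration of the general technique, not Mistral's actual router.

```python
import math
import random

def topk_moe_route(gate_logits, k=4):
    """Select the k highest-scoring experts for one token and
    softmax-normalize their mixing weights (generic top-k MoE routing)."""
    topk = sorted(range(len(gate_logits)),
                  key=lambda i: gate_logits[i], reverse=True)[:k]
    m = max(gate_logits[i] for i in topk)          # subtract max for stability
    exps = [math.exp(gate_logits[i] - m) for i in topk]
    total = sum(exps)
    return topk, [e / total for e in exps]

# Stand-in router scores: one logit per expert (128 experts, as in Small 4).
random.seed(0)
gate_logits = [random.gauss(0.0, 1.0) for _ in range(128)]
experts, weights = topk_moe_route(gate_logits, k=4)
# Only these 4 experts' feed-forward blocks would execute for this token;
# their outputs are combined using `weights`, which sum to 1.
```

In a real model the token's output is the weighted sum of the selected experts' outputs, so compute scales with k (4), not with the expert count (128).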
The model demonstrates significant performance improvements over its predecessor, with a 40% reduction in end-to-end completion time and 3x more requests per second in throughput-optimized setups. Small 4 supports both text and image inputs natively, making it suitable for document parsing, visual analysis, chat, coding, and complex reasoning tasks. On benchmarks, it matches or surpasses GPT-OSS 120B while generating significantly shorter outputs, a key indicator of efficiency. Mistral is joining NVIDIA's Nemotron Coalition as a founding member; the model, released under the Apache 2.0 license, ships with optimizations for NVIDIA hardware and has been optimized for deployment on platforms including vLLM, llama.cpp, SGLang, and Transformers.
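For the configurable reasoning effort described above, inference frameworks such as vLLM typically expose models through an OpenAI-compatible chat endpoint, so the toggle would plausibly surface as a request-level field. The sketch below only assembles such a request payload; the `reasoning_effort` field name and the `mistral-small-4` model id are assumptions for illustration, not confirmed by the announcement.

```python
def build_chat_request(prompt, reasoning_effort="low"):
    """Assemble an OpenAI-compatible chat-completions payload with a
    hypothetical per-request reasoning toggle (field name is assumed)."""
    if reasoning_effort not in ("low", "high"):
        raise ValueError("reasoning_effort must be 'low' or 'high'")
    return {
        "model": "mistral-small-4",          # placeholder model id
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": reasoning_effort,  # hypothetical field
    }

# Fast, low-latency mode for routine chat; deep mode for hard problems.
fast = build_chat_request("Summarize this contract.", reasoning_effort="low")
deep = build_chat_request("Find the bug in this parser.", reasoning_effort="high")
```

The point of a request-level toggle is that one deployed model can serve both latency-sensitive chat traffic and reasoning-heavy jobs without swapping checkpoints.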
Editorial Opinion
Mistral Small 4 represents a meaningful evolution in open-source AI models by eliminating the need to choose between specialized capabilities: users can now rely on a single model for chat, reasoning, and coding tasks. The configurable reasoning effort feature is particularly noteworthy, allowing dynamic adjustment of inference behavior to balance speed and depth. However, the efficiency gains, while impressive on paper, will need to be validated in real-world deployments, and it remains to be seen whether the consolidated model can fully replace the task-specific fine-tunes developers built on its specialized predecessors.