Mistral Announces Small 4: Unified Multimodal Model Combining Reasoning, Coding, and Chat Capabilities
Key Takeaways
- Mistral Small 4 unifies reasoning, multimodal, and coding capabilities into a single open-source model with 119B parameters and configurable reasoning effort
- The model achieves 40% lower latency and 3x higher throughput than Small 3, while matching or exceeding GPT-OSS 120B performance with more concise outputs
- Native multimodality (text and image inputs) and a 256k context window enable versatile applications, from document analysis to complex reasoning
Summary
Mistral has announced Mistral Small 4, a major release that consolidates the capabilities of three specialized models—Magistral (reasoning), Pixtral (multimodal), and Devstral (coding)—into a single, versatile model. The 119B-parameter hybrid model features a Mixture of Experts architecture with 128 experts and 4 active per token, delivering 6B active parameters and a 256k context window. Released under the Apache 2.0 license, Small 4 introduces configurable reasoning effort, allowing users to toggle between fast, low-latency responses and deep, reasoning-intensive outputs depending on task requirements.
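The "128 experts, 4 active per token" design described above is standard top-k Mixture-of-Experts routing: a gating network scores every expert for each token, and only the k highest-scoring experts run, which is why the active parameter count (6B) is far below the total (119B). The sketch below illustrates the routing step only, with random stand-in gate scores; it is an illustration of the general technique, not Mistral's actual router.

```python
import math
import random

def topk_moe_route(gate_logits, k=4):
    """Select the k highest-scoring experts for one token and
    softmax-normalize their mixing weights (generic top-k MoE routing)."""
    topk = sorted(range(len(gate_logits)),
                  key=lambda i: gate_logits[i], reverse=True)[:k]
    m = max(gate_logits[i] for i in topk)          # subtract max for stability
    exps = [math.exp(gate_logits[i] - m) for i in topk]
    total = sum(exps)
    return topk, [e / total for e in exps]

# Stand-in router scores: one logit per expert (128 experts, as in Small 4).
random.seed(0)
gate_logits = [random.gauss(0.0, 1.0) for _ in range(128)]
experts, weights = topk_moe_route(gate_logits, k=4)
# Only these 4 experts' feed-forward blocks would execute for this token;
# their outputs are combined using `weights`, which sum to 1.
```

In a real model the token's output is the weighted sum of the selected experts' outputs, so compute scales with k (4), not with the expert count (128).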
The model demonstrates significant performance improvements over its predecessor, with a 40% reduction in end-to-end completion time and 3x more requests per second in throughput-optimized setups. Small 4 supports both text and image inputs natively, making it suitable for document parsing, visual analysis, chat, coding, and complex reasoning tasks. On benchmarks, it matches or surpasses GPT-OSS 120B while generating significantly shorter outputs, a key indicator of efficiency. Mistral is joining NVIDIA's Nemotron Coalition as a founding member; the model, released under the Apache 2.0 license, ships with optimizations for NVIDIA hardware and has been optimized for deployment on platforms including vLLM, llama.cpp, SGLang, and Transformers.
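For the configurable reasoning effort described above, inference frameworks such as vLLM typically expose models through an OpenAI-compatible chat endpoint, so the toggle would plausibly surface as a request-level field. The sketch below only assembles such a request payload; the `reasoning_effort` field name and the `mistral-small-4` model id are assumptions for illustration, not confirmed by the announcement.

```python
def build_chat_request(prompt, reasoning_effort="low"):
    """Assemble an OpenAI-compatible chat-completions payload with a
    hypothetical per-request reasoning toggle (field name is assumed)."""
    if reasoning_effort not in ("low", "high"):
        raise ValueError("reasoning_effort must be 'low' or 'high'")
    return {
        "model": "mistral-small-4",          # placeholder model id
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": reasoning_effort,  # hypothetical field
    }

# Fast, low-latency mode for routine chat; deep mode for hard problems.
fast = build_chat_request("Summarize this contract.", reasoning_effort="low")
deep = build_chat_request("Find the bug in this parser.", reasoning_effort="high")
```

The point of a request-level toggle is that one deployed model can serve both latency-sensitive chat traffic and reasoning-heavy jobs without swapping checkpoints.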
Editorial Opinion
Mistral Small 4 represents a meaningful evolution in open-source AI models by eliminating the need to choose between specialized capabilities: users can now rely on a single model for chat, reasoning, and coding tasks. The configurable reasoning effort feature is particularly noteworthy, allowing dynamic adjustment of inference behavior to balance speed and depth. However, the efficiency gains, while impressive on paper, will need to be validated in real-world deployments, and it remains to be seen whether the consolidated model can fully replace the task-specific fine-tunes developers built on its specialized predecessors.