Cohere Launches North Mini Code 1.0: Open-Source Agentic Coding Model with NVFP4 Acceleration
Key Takeaways
- ▸North Mini Code 1.0 is Cohere's first open-source agentic coding model, released under Apache 2.0 for sovereign deployment
- ▸NVFP4 quantization delivers 1.65x throughput gains and 40% memory reduction while maintaining identical quality on HumanEval benchmarks
- ▸The model includes tool calling and reasoning capabilities and runs efficiently on a single Spark Arena instance under vLLM
Summary
Cohere has released North Mini Code 1.0, an open-source agentic coding model built for sovereign, self-hosted deployments. The 30B MoE model (3 billion active parameters) is licensed under Apache 2.0 and includes native tool calling and reasoning capabilities for code generation and analysis tasks.
Using NVIDIA's NVFP4 (4-bit precision) quantization, the model demonstrates significant efficiency gains on NVIDIA Spark Arena hardware. Benchmarks show approximately 1.65x faster token throughput (52 tokens/second vs 32 on FP8), a 40% reduction in memory footprint (17GB vs 28GB), and identical quality metrics on HumanEval—with zero measurable loss in performance. The model runs efficiently on a single Spark instance under vLLM with FP8 key-value cache.
This release positions Cohere in the competitive open-source agentic coding space, offering developers a practical, resource-efficient alternative that doesn't require cloud infrastructure. Both FP8 reference and NVFP4 quantized versions are available on Spark Arena for benchmarking and reproduction, with full recipes and configuration logs provided for transparency.
- Both FP8 and NVFP4 implementations are publicly available with full benchmarks and quantization recipes for reproducibility
Editorial Opinion
Cohere's North Mini Code represents a meaningful step toward practical agentic AI for developers who need sovereign control over their infrastructure. The combination of open-source licensing, aggressive NVFP4 quantization that maintains performance parity, and zero measurable quality loss directly addresses the resource constraints that make self-hosted coding agents impractical for many teams. If the quality profile holds up beyond HumanEval on real-world coding tasks, this could become a benchmark baseline for open-source agentic development. The transparency around benchmarking and recipe availability is commendable and should help drive community adoption and reproducibility.



