North Mini Code (Cohere)
Cohere’s first agentic coding model and the inaugural member of its next-generation North family, released 9 June 2026 under Apache 2.0. It is an open-weight model, not a closed sovereign-only product. (Sources: Cohere’s own announcement and the Cohere Labs model card on Hugging Face.)
Architecture
A Mixture-of-Experts model: 30B total parameters, 3B active (128 experts, 8 activated per token). Interleaved sliding-window and full self-attention in a 3:1 ratio; SwiGLU FFN blocks with a sigmoid router. 256K context, 64K max generation. Minimum hardware is a single H100 at FP8 — small enough to run locally, which is how it serves the sovereignty pitch: own the weights and run them on-premise rather than route data to a shared cloud.
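As a rough illustration of the two mechanisms named above, the sketch below routes one token through a sigmoid-scored top-8-of-128 expert selection and lays out a 3:1 sliding-window/full attention schedule. It is not Cohere's implementation. The function names are hypothetical, and two details are assumptions the source does not confirm: that the selected gate scores are renormalised to sum to 1, and that the full-attention layer is the last of each group of four.

```python
import math

NUM_EXPERTS = 128  # total experts, per the figures quoted above
TOP_K = 8          # experts activated per token


def sigmoid(x: float) -> float:
    """Logistic function; each expert is scored independently (no softmax)."""
    return 1.0 / (1.0 + math.exp(-x))


def route_token(logits: list[float], top_k: int = TOP_K) -> dict[int, float]:
    """Return {expert_index: gate_weight} for the top_k experts of one token.

    ASSUMPTION: the selected sigmoid scores are renormalised to sum to 1.
    The source only states that the router is a sigmoid router.
    """
    scores = [sigmoid(z) for z in logits]
    # Rank expert indices by score, highest first, and keep the first top_k.
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    total = sum(scores[i] for i in chosen)
    return {i: scores[i] / total for i in chosen}


def layer_pattern(num_layers: int, sliding_per_full: int = 3) -> list[str]:
    """Label each layer 'sliding' or 'full' in a sliding_per_full:1 ratio.

    ASSUMPTION: the full-attention layer closes each group; the source gives
    only the ratio, not the ordering inside a group.
    """
    period = sliding_per_full + 1
    return ["full" if (i + 1) % period == 0 else "sliding" for i in range(num_layers)]
```

Calling `route_token` with 128 router logits yields eight gate weights, so only 8 of the 128 expert FFNs run for that token. This is the mechanism behind the gap between 30B total and 3B active parameters.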
Benchmarks (vendor-reported)
- Artificial Analysis Coding Index: 33.4 — notable because artificial-analysis is the independent platform the spoke uses as its neutral yardstick, so this number is checkable.
- SWE-Bench Verified: 83.2% pass@1 (final; 80.2% pass@10 after SFT).
- Terminal-Bench v2: 62.8% pass@1 (final).
- Mini-SWE-Agent: 61.0% pass@1; human eval 66.1% win rate on code-editing tasks.
- Throughput: claimed up to 2.8× higher output throughput than Devstral Small 2 and a 30% inter-token-latency advantage.
- Size comparison: Cohere claims it outperforms several much larger models (Nemotron 3 Super 120B, Mistral Small 4 119B, Devstral 2 123B) on coding benchmarks. This is the 3B-active MoE efficiency story: the same sparse-MoE lever that open-weight models use to reach frontier capability at practical inference cost.
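The pass@1 and pass@10 figures above measure different things, so a short sketch of how pass@k is commonly estimated may help when reading them. The code uses the standard unbiased estimator from the code-generation literature. Whether Cohere computed its numbers this way is an assumption, and the function names are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n sampled attempts, c of which passed.

    Computes 1 - C(n - c, k) / C(n, k): the probability that a random
    size-k subset of the n attempts contains at least one passing attempt.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the result is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average per-task pass@k over (n, c) pairs, one pair per task."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For a single checkpoint, pass@10 can never be lower than pass@1, so a pass@10 figure that sits below a pass@1 figure only makes sense when the two come from different checkpoints, as with the after-SFT and final numbers quoted above.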
Trained on 70% and 61% code tokens in its two SFT stages respectively, over 70k+ verifiable tasks across ~5k repositories. It is positioned for code generation, agentic software engineering, and terminal tasks (agent orchestration, architecture mapping, code reviews).
Availability
Hugging Face (BF16 + FP8 quantized weights), the Cohere API, Model Vault (managed inference), OpenRouter, and OpenCode.
Why it matters
This revises the read on cohere: the lab is no longer only a closed-deployment sovereign player. North Mini Code is Apache-2.0 open-weight, so Cohere now achieves sovereignty through open weights and local deployment — putting it on the open-weight axis beside qwen, deepseek, and mistral-ai, not opposite it. The enterprise/sovereign positioning is the go-to-market, not the licensing.
Tier note (T3 → upgraded from a T4 stub): earlier ingested as a JS-gated The New Stack headline (trade press, body unrecoverable); now rebuilt from Cohere’s own blog + the Cohere Labs Hugging Face model card. Vendor-primary (self-interested on benchmark selection), but the specs are inspectable in the open weights and the Artificial Analysis score is independently defined.
Related: cohere · llm-provider · open-weight-models · artificial-analysis · qwen