Qwen3-Coder
Open weightsby Alibaba Cloud·China·Released
Open-weights coding specialist — 480B MoE, agentic by design.
About this model
Qwen3-Coder (July 2025) is the largest open-weights coding-specialist model — a 480B-parameter Mixture-of-Experts with 35B activated per token, built on the Qwen3 backbone and post-trained heavily for agentic coding workflows. It ships with native support for the Qwen Code CLI, Claude Code style usage, and standard chat APIs.
On SWE-bench Verified, Qwen3-Coder scored 67% at launch — the highest open-weights score on the benchmark at the time of writing, ahead of Kimi K2's 65.8% and within striking distance of Claude Sonnet 4. The 256K-token context (1M with YaRN scaling) makes whole-codebase analysis practical, and the Apache 2.0 license means no commercial restrictions.
Strengths
- •Highest open-weights SWE-bench Verified score (67%)
- •256K context (1M via YaRN) for whole-codebase reasoning
- •Apache 2.0 — no MAU caps or commercial restrictions
- •Native agentic post-training — not just code completion
Limitations
- •480B MoE requires substantial serving infrastructure
- •Smaller distilled variants below the 35B-active flagship are less competitive
- •Less mature IDE plugin ecosystem than Cursor/Continue + Claude
When to use it
- →Self-hosted coding agents on private infrastructure
- →Open alternatives to Claude Code / Cursor agent mode
- →Whole-repo refactors with the 1M-YaRN context
- →Research on agentic post-training methodology
Architecture & training
Built on the Qwen3 backbone, then post-trained on agentic coding traces — Alibaba's blog post emphasises long-horizon RL over multi-turn coding sessions rather than just supervised next-token prediction on code. The 480B/35B-active config matches the param-efficiency targets DeepSeek introduced with V3.
Benchmarks
| Benchmark | Score | Bar |
|---|---|---|
| HumanEval | 92.5 | |
| SWE-bench Verified | 67.0 |