We present MiMo-V2-Flash, a Mixture-of-Experts (MoE) model with 309B total parameters and 15B active parameters, designed for fast, strong reasoning and agentic capabilities. MiMo-V2-Flash adopts a hybrid attention architecture that interleaves Sliding Window Attention (SWA) with global attention, using a 128-token sliding window and a 5:1 hybrid ratio. The model is pre-trained on 27 trillion tokens with Multi-Token Prediction (MTP) at a native 32k context length, which is subsequently extended to 256k. To scale post-training compute efficiently, MiMo-V2-Flash introduces a novel Multi-Teacher On-Policy Distillation (MOPD) paradigm. In this framework, domain-specialized teachers (e.g., trained via large-scale reinforcement learning) provide dense, token-level rewards, enabling the student model to perfectly master teacher expertise. MiMo-V2-Flash rivals top-tier open-weight models such as DeepSeek-V3.2 and Kimi-K2, despite using only 1/2 and 1/3 of their total parameters, respectively. During inference, by repurposing MTP as a draft model for speculative decoding, MiMo-V2-Flash achieves an acceptance length of up to 3.6 and a decoding speedup of up to 2.6x with three MTP layers. We open-source both the model weights and the three-layer MTP weights to foster open research and community collaboration.
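To make the hybrid attention layout concrete, the following is a minimal sketch, not the authors' implementation. It assumes the 5:1 ratio means five SWA layers per global-attention layer, that the global layer closes each group of six, and that the 128-token window counts the query token itself. The abstract confirms none of these details. The function names `layer_pattern` and `can_attend` and the 12-layer depth are hypothetical.

```python
def layer_pattern(num_layers, swa_per_global=5):
    """Label each layer 'swa' or 'global'.

    Assumption: one global-attention layer closes every group of
    `swa_per_global` sliding-window layers (placement is hypothetical).
    """
    period = swa_per_global + 1
    return ["global" if (i + 1) % period == 0 else "swa"
            for i in range(num_layers)]


def can_attend(query_pos, key_pos, window=None):
    """Causal visibility test for a single query/key pair.

    window=None models global attention (all earlier tokens visible).
    Otherwise only the most recent `window` positions, including the
    query position itself, are visible (an assumed convention).
    """
    if key_pos > query_pos:
        return False  # causal: never look at future tokens
    if window is None:
        return True
    return query_pos - key_pos < window


# Hypothetical 12-layer stack: two groups of five SWA layers plus one global layer.
pattern = layer_pattern(12)
print(pattern)

# A query at position 200 in an SWA layer sees positions 73..200 only.
print(can_attend(200, 73, window=128), can_attend(200, 72, window=128))
```

Under these assumptions, an SWA layer only needs to cache keys and values for the most recent 128 positions, while the global layers keep the full context. This is the usual motivation for interleaving the two layer types.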
MiMo-V2-Flash Technical Report
We present MiMo-V2-Flash, a Mixture-of-Experts (MoE) model with 309B total parameters and 15B active parameters, designed for fast, strong reasoning and agentic capabilities.
- Year: 2026
- Venue: arXiv 2026
- Authors: 125
Attribution
- Abstract & full text: arxiv.org/abs/2601.02780
- TL;DR: Semantic Scholar
Authors
Dong Zhang, Hao Wu, Shuo Liu, Gang Wang, Fuli Luo, Liang Zhao, Zhipeng Xu, Bingquan Xia, Bowen Shen, Dawei Zhu, Hailin Zhang, Huaqiu Liu, Jiebao Xiao, Jinhao Dong, Peidian Li, Shihua Yu, Shimao Chen, Weikun Wang, Wenhan Ma, Xiangwei Deng, YiFan Song, Zihan Jiang, Bowen Ye, Can Cai, Chenhong He, Duo Zhang, Guoan Wang, Hao Tian, Heng Qu, Hongshen Xu, Jun Shi, Kainan Bao, Kang Zhou, Lei LI, Menghang Zhu, Nuo Chen, Shaohui Liu, Shicheng Li, Shuhao Gu, Shuhuai Ren, Sirui Deng, Weiji Zhuang, Wenyu Yang, Xin Zhang, Xing Yong, Xing Zhang, Xu Wang, Yihan Yan, Yu Tu, Yuanyuan Tian, Yudong Wang, Yue Yu, Zhenru Lin, Zhichao Song, Zihao Yue, Shaolei Zhang, Zhiyang Chen, Bangjun Xiao, Bo Yang, Bofei Gao, Chen Zhang, Chiheng Lou, Gang Xie, Hanglong Lv, Hanyu Li, Heyu Chen, Houbin Zhang, Jiangshan Duo, Jianyu Wei, Junhao Hu, Linghao Zhang, Qianli Chen, Shijie Cao, Shouqiu Yu, Tianling Zhou, Weijiang Su, Bohan Mao, Chenghua Wang, Chengxuan Zhu, Chong Ma, Chun Chen, Chunan Li, Deshan Xiao, Fangyue Liu, Feiyu Yang, Fengyuan Shi, Hongfei Yi, Hongxu An, Hongyi Guan, Yihao Zhao, Yingchun Lai, Yizhao Gao, Yu Cheng, Zhen Tang, Zhengju Tang, Zhengtao Wen, Zhixian Zheng, Jian Wen, Jiarui Sun, Jiawei Li, Jinlong Xue, Jun Xia, Kai Fang, Qian Tu, Qihao Zhang, Qiying Wang, Rang Li, Rui Ma, Shengfan Wang, Tao Guo, Tianyang Lu, Weikang Zhang, Weimin Xiong, Wenshan Huang, Xueyang Xie, Yilin Jiang, Yixin Yang, Yongzhe He, Yuanliang Dong, Yuchen Liu, Yue Ma, Yuxing Xiang, Zhaojun Huang, Zhonghua Deng, Zihan Zhang