Junyang Lin, Jingren Zhou, Hangrui Hu et al. · 22 Jan 2026
In this report, we present the Qwen3-TTS series, a family of advanced multilingual, controllable, robust, and streaming text-to-speech models.
Trending research and the full catalog - each paper linked to the benchmarks, methods, and models it introduces.
Yitian Gong, Botian Jiang, Yiwei Zhao et al. · 18 Mar 2026
This technical report presents MOSS-TTS, a speech generation foundation model built on a scalable recipe: discrete audio tokens, autoregressive modeling, and large-scale pretraining.
5 Jun 2026
We present dots.tts, a 2B-parameter continuous autoregressive text-to-speech (TTS) foundation model that models speech in a continuous latent space. Compared with existing continuous autoregressive models, our key innovations are threefold.