We introduce the latest series of TeleChat models: \textbf{TeleChat2}, \textbf{TeleChat2.5}, and \textbf{T1}, a significant upgrade over their predecessor, TeleChat. Despite minimal changes to the model architecture, the new series achieves substantial performance gains through enhanced training strategies in both the pretraining and post-training stages. The series begins with \textbf{TeleChat2}, which is pretrained on 10 trillion high-quality and diverse tokens and then refined with Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). \textbf{TeleChat2.5} and \textbf{T1} extend this pipeline with a continual pretraining phase on domain-specific datasets, combined with reinforcement learning (RL) to improve performance on code generation and mathematical reasoning tasks. The \textbf{T1} variant is designed for complex reasoning: it supports long Chain-of-Thought (CoT) reasoning and demonstrates substantial improvements in mathematics and coding. In contrast, \textbf{TeleChat2.5} prioritizes speed, delivering rapid inference. The flagship \textbf{T1} and \textbf{TeleChat2.5} models are both dense Transformer-based architectures with 115B parameters, and both show significant advances in reasoning and general task performance over the original TeleChat. Notably, \textbf{T1-115B} outperforms proprietary models such as OpenAI's o1-mini and GPT-4o. We publicly release \textbf{TeleChat2}, \textbf{TeleChat2.5}, and \textbf{T1}, including post-trained versions with 35B and 115B parameters, to give developers and researchers state-of-the-art language models for diverse applications.
Technical Report of TeleChat2, TeleChat2.5 and T1
The TeleChat2, TeleChat2.5, and T1 models achieve performance improvements through enhanced training strategies, including Supervised Fine-Tuning, Direct Preference Optimization, and reinforcement learning, with T1 focusing on complex reasoning and TeleChat2.5 on speed.
- Year
- 2025
- Venue
- arXiv 2025
- Authors
- 38
Attribution
- Abstract & full text
- arxiv.org/abs/2507.18013v3
- TL;DR
- Semantic Scholar
Authors
Zihan Wang, Xin Wang, Xuelong Li, Yu Zhao, Zhaohu Xing, Chao Wang, Zhuo Jiang, Yuhan Sun, Zhihao Yang, Kaidong Yu, Xinzhang Liu, Yitong Yao, Wenmin Deng, Kaipeng Jia, Jiaxin Peng, Yuyao Huang, Sishi Xiong, Xiaohui Hu, Fubei Yao, Ruiyu Fang, Zhuoru Jiang, Ruiting Song, Qiyi Xie, Rui Xue, Xuewei He, Yanlei Xue, Zhu Yuan, Zhaoxi Zhang, Zilu Huang, Shiquan Wang, Hanming Wu, Mingyuan Wang, Xufeng Zhan, Yuhao Jiang, Bingkai Yang, Shuangyong Song, Yongxiang Li, Zhongjiang He