Bowen Zhou
Tsinghua professor and director of the Shanghai AI Lab; previously VP of AI at JD.com and IBM Watson, focused on trustworthy LLMs.
- Role: professor
- Currently at: Tsinghua University
- Scholar: scholar.google.com/citations
- Papers: 45
Authored papers (45)
Post-Trained MoE Can Skip Half Experts via Self-Distillation
arXiv 2026
InternVLA-A1: Unifying Understanding, Generation and Action for Robotic Manipulation
arXiv 2026
InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery
arXiv 2026
Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization
arXiv 2026
P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads
arXiv 2026
MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing
arXiv 2025
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
arXiv 2025
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning
arXiv 2025
TTRL: Test-Time Reinforcement Learning
arXiv 2025
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
arXiv 2025
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification
arXiv 2025
Process Reinforcement through Implicit Rewards
arXiv 2025
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
arXiv 2025
Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback
arXiv 2025
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
arXiv 2025
SSRL: Self-Search Reinforcement Learning
arXiv 2025
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data
arXiv 2025
FlowRL: Matching Reward Distributions for LLM Reasoning
arXiv 2025
A Survey of Reinforcement Learning for Large Reasoning Models
arXiv 2025
P1: Mastering Physics Olympiads with Reinforcement Learning
arXiv 2025
InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
arXiv 2025
Towards a Unified View of Large Language Model Post-Training
arXiv 2025
From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery
arXiv 2025
Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny
arXiv 2025
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
arXiv 2025
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
arXiv 2025
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
arXiv 2025
GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
arXiv 2025
Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices
arXiv 2024
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process
arXiv 2024
Advancing LLM Reasoning Generalists with Preference Trees
arXiv 2024
UltraMedical: Building Specialized Generalists in Biomedicine
arXiv 2024
Free Process Rewards without Process Labels
arXiv 2024
Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System
arXiv 2024
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
arXiv 2024
LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion
CVPR 2024
MAG-SQL: Multi-Agent Generative Approach with Soft Schema Linking and Iterative Sub-SQL Refinement for Text-to-SQL
arXiv 2024
Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation
arXiv 2024
Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding
arXiv 2024
How to Synthesize Text Data without Model Collapse?
arXiv 2024
Enhancing Chat Language Models by Scaling High-quality Instructional Conversations
EMNLP
Sparse Low-rank Adaptation of Pre-trained Language Models
arXiv 2023
PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning
arXiv 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
arXiv 2023
LMD: Faster Image Reconstruction with Latent Masking Diffusion
arXiv 2023
Tool contributions (1)
Affiliations
Frequent co-authors (10, from 45 papers)