Bowen Zhou

Tsinghua professor and director of the Shanghai AI Lab; previously VP of AI at JD.com and IBM Watson, focused on trustworthy LLMs.

Role: professor
Papers: 45

45 papers · 1 tool contribution

Authored papers

45

Post-Trained MoE Can Skip Half Experts via Self-Distillation

arXiv 2026

2026

InternVLA-A1: Unifying Understanding, Generation and Action for Robotic Manipulation

arXiv 2026

2026

InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery

arXiv 2026

2026

Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization

arXiv 2026

2026

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

arXiv 2026

2026

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

arXiv 2025

2025

A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers

arXiv 2025

2025

SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

arXiv 2025

2025

TTRL: Test-Time Reinforcement Learning

arXiv 2025

2025

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

arXiv 2025

2025

NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification

arXiv 2025

2025

Process Reinforcement through Implicit Rewards

arXiv 2025

2025

MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding

arXiv 2025

2025

Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback

arXiv 2025

2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

arXiv 2025

2025

SSRL: Self-Search Reinforcement Learning

arXiv 2025

2025

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

arXiv 2025

2025

FlowRL: Matching Reward Distributions for LLM Reasoning

arXiv 2025

2025

A Survey of Reinforcement Learning for Large Reasoning Models

arXiv 2025

2025

P1: Mastering Physics Olympiads with Reinforcement Learning

arXiv 2025

2025

InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy

arXiv 2025

2025

Towards a Unified View of Large Language Model Post-Training

arXiv 2025

2025

From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery

arXiv 2025

2025

Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny

arXiv 2025

2025

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

arXiv 2025

2025

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

arXiv 2025

2025

Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

arXiv 2025

2025

GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

arXiv 2025

2025

Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices

arXiv 2024

2024

Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process

arXiv 2024

2024

Advancing LLM Reasoning Generalists with Preference Trees

arXiv 2024

2024

UltraMedical: Building Specialized Generalists in Biomedicine

arXiv 2024

2024

Free Process Rewards without Process Labels

arXiv 2024

2024

Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System

arXiv 2024

2024

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

arXiv 2024

2024

LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion

CVPR 2024

2024

MAG-SQL: Multi-Agent Generative Approach with Soft Schema Linking and Iterative Sub-SQL Refinement for Text-to-SQL

arXiv 2024

2024

Large Language Models as Biomedical Hypothesis Generators: A Comprehensive Evaluation

arXiv 2024

2024

Fast and Slow Generating: An Empirical Study on Large and Small Language Models Collaborative Decoding

arXiv 2024

2024

How to Synthesize Text Data without Model Collapse?

arXiv 2024

2024

Enhancing Chat Language Models by Scaling High-quality Instructional Conversations

EMNLP

2023

Sparse Low-rank Adaptation of Pre-trained Language Models

arXiv 2023

2023

PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning

arXiv 2023

2023

CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model

arXiv 2023

2023

LMD: Faster Image Reconstruction with Latent Masking Diffusion

arXiv 2023

2023

Tool contributions

1

Frequent co-authors

10

from 45 papers