Wanli Ouyang
- Papers: 69
Authored papers (69)
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models
arXiv 2026
StraTA: Incentivizing Agentic Reinforcement Learning with Strategic Trajectory Abstraction
arXiv 2026
A Survey of Scientific Large Language Models: From Data Foundations to Agent Frontiers
arXiv 2025
GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
arXiv 2025
Flow-GRPO: Training Flow Matching Models via Online RL
arXiv 2025
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models
arXiv 2025
CPRet: A Dataset, Benchmark, and Model for Retrieval in Competitive Programming
arXiv 2025
The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants
arXiv 2025
Native-Resolution Image Synthesis
arXiv 2025
Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback
arXiv 2025
ChemMLLM: Chemical Multimodal Large Language Model
arXiv 2025
AlignRAG: Leveraging Critique Learning for Evidence-Sensitive Retrieval-Augmented Reasoning
arXiv 2025
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
arXiv 2025
CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics
arXiv 2025
Transition Models: Rethinking the Generative Learning Objective
arXiv 2025
Interleaving Reasoning for Better Text-to-Image Generation
arXiv 2025
Cache-to-Cache: Direct Semantic Communication Between Large Language Models
arXiv 2025
Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation
arXiv 2025
Cut2Next: Generating Next Shot via In-Context Tuning
arXiv 2025
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
arXiv 2025
Agentic Jigsaw Interaction Learning for Enhancing Visual Perception and Reasoning in Vision-Language Models
arXiv 2025
From AI for Science to Agentic Science: A Survey on Autonomous Scientific Discovery
arXiv 2025
CoMAS: Co-Evolving Multi-Agent Systems via Interaction Rewards
arXiv 2025
SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines
arXiv 2025
PRING: Rethinking Protein-Protein Interaction Prediction from Pairs to Graphs
arXiv 2025
Chem-R: Learning to Reason as a Chemist
arXiv 2025
Fitness aligned structural modeling enables scalable virtual screening with AuroBind
arXiv 2025
TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models
arXiv 2025
SparseFlex: High-Resolution and Arbitrary-Topology 3D Shape Modeling
ICCV 2025
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
arXiv 2025
Nature-Inspired Population-Based Evolution of Large Language Models
arXiv 2025
AutoMat: Enabling Automated Crystal Structure Reconstruction from Microscopy via Agentic Tool Use
arXiv 2025
Do We Truly Need So Many Samples? Multi-LLM Repeated Sampling Efficiently Scales Test-Time Compute
arXiv 2025
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows
arXiv 2025
MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback
arXiv 2025
OASIS: Open Agent Social Interaction Simulations with One Million Agents
arXiv 2024
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
arXiv 2024
A Comprehensive Survey on 3D Content Generation
arXiv 2024
PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines
arXiv 2024
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
arXiv 2024
Neuro-3D: Towards 3D Visual Decoding from EEG Signals
CVPR 2025
ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area
arXiv 2024
Depth Any Video with Scalable Synthetic Data
arXiv 2024
Accessing GPT-4 level Mathematical Olympiad Solutions via Monte Carlo Tree Self-refine with LLaMa-3 8B
arXiv 2024
Bilateral Reference for High-Resolution Dichotomous Image Segmentation
arXiv 2024
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model
arXiv 2024
Biology Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models
arXiv 2024
Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
CVPR 2025
ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems
CVPR 2025
Dense Connector for MLLMs
arXiv 2024
Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System
arXiv 2024
BEACON: Benchmark for Comprehensive RNA Tasks and Language Models
arXiv 2024
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses
arXiv 2024
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
arXiv 2024
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
arXiv 2023
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
arXiv 2023
RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models
arXiv 2023
Meta-Transformer: A Unified Framework for Multimodal Learning
arXiv 2023
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving
CVPR 2024
Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization
arXiv 2023
UniHCP: A Unified Model for Human-Centric Perceptions
CVPR 2023
What Can Simple Arithmetic Operations Do for Temporal Modeling?
ICCV 2023
Masked Motion Predictors are Strong 3D Action Representation Learners
ICCV 2023
Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection
CVPR 2023
NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space
ICCV 2023
CLIP2Point: Transfer CLIP to Point Cloud Classification with Image-Depth Pre-training
ICCV 2023
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
ICLR 2022
DETR for Crowd Pedestrian Detection
arXiv 2020
Learning 3D Human Shape and Pose from Dense Body Parts
arXiv 2019
Frequent co-authors
10 from 69 papers