Weinan Zhang
- Papers: 26
Authored papers
MMSkills: Towards Multimodal Skills for General Visual Agents
arXiv 2026
Anticipate and Learn: Unleashing Idle-Time Compute in Proactive Agents
arXiv 2026
PhotoBench: Beyond Visual Matching Towards Personalized Intent-Driven Photo Retrieval
arXiv 2026
RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training
arXiv 2025
MARFT: Multi-Agent Reinforcement Fine-Tuning
arXiv 2025
ColorAgent: Building A Robust, Personalized, and Interactive OS Agent
arXiv 2025
Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents
arXiv 2025
LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls
arXiv 2025
ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning
arXiv 2025
Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
arXiv 2025
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
arXiv 2024
HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios
arXiv 2024
Hammer: Robust Function-Calling for On-Device Language Models via Function Masking
arXiv 2024
GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs
arXiv 2024
Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training
arXiv 2024
TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision
arXiv 2024
ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update
arXiv 2024
A Survey on LLM-based Multi-Agent System: Recent Advances and New Frontiers in Application
arXiv 2024
Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training
arXiv 2023
Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective
arXiv 2023
Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning
arXiv 2023
CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models
arXiv 2023
GeoGalactica: A Scientific Large Language Model in Geoscience
arXiv 2023
On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective
arXiv 2022
DropNAS: Grouped Operation Dropout for Differentiable Architecture Search
arXiv 2022
NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning
arXiv 2021
Frequent co-authors
10 from 26 papers