Weinan Zhang

Papers
26

Authored papers

MMSkills: Towards Multimodal Skills for General Visual Agents

arXiv 2026

2026

Anticipate and Learn: Unleashing Idle-Time Compute in Proactive Agents

arXiv 2026

2026

PhotoBench: Beyond Visual Matching Towards Personalized Intent-Driven Photo Retrieval

arXiv 2026

2026

RLinf-VLA: A Unified and Efficient Framework for VLA+RL Training

arXiv 2025

2025

MARFT: Multi-Agent Reinforcement Fine-Tuning

arXiv 2025

2025

ColorAgent: Building A Robust, Personalized, and Interactive OS Agent

arXiv 2025

2025

Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents

arXiv 2025

2025

LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls

arXiv 2025

2025

ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning

arXiv 2025

2025

Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration

arXiv 2025

2025

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

arXiv 2024

2024

HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios

arXiv 2024

2024

Hammer: Robust Function-Calling for On-Device Language Models via Function Masking

arXiv 2024

2024

GenSim2: Scaling Robot Data Generation with Multi-modal and Reasoning LLMs

arXiv 2024

2024

Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training

arXiv 2024

2024

TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision

arXiv 2024

2024

ODICE: Revealing the Mystery of Distribution Correction Estimation via Orthogonal-gradient Update

arXiv 2024

2024

A Survey on LLM-based Multi-Agent System: Recent Advances and New Frontiers in Application

arXiv 2024

2024

Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

arXiv 2023

2023

Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective

arXiv 2023

2023

Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

2023

CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models

arXiv 2023

2023

GeoGalactica: A Scientific Large Language Model in Geoscience

arXiv 2023

2023

On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective

arXiv 2022

2022

DropNAS: Grouped Operation Dropout for Differentiable Architecture Search

arXiv 2022

2022

NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning

arXiv 2021

2021

Affiliations

No known affiliations.

Frequent co-authors

10

from 26 papers