Wenqi Zhang

Papers
22

Authored papers

22

GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification

arXiv 2026

2026

KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation

arXiv 2026

2026

GroundAct: Can LLM Agents Ground Actions in Environmental States?

arXiv 2025

2026

UI-Copilot: Advancing Long-Horizon GUI Automation via Tool-Integrated Policy Optimization

arXiv 2026

2026

EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering

arXiv 2025

2025

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

arXiv 2025

2025

RG-Attn: Radian Glue Attention for Multi-modality Multi-agent Cooperative Perception

arXiv 2025

2025

GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts

arXiv 2025

2025

GUI-G$^2$: Gaussian Reward Modeling for GUI Grounding

arXiv 2025

2025

Let LLMs Break Free from Overthinking via Self-Braking Tuning

arXiv 2025

2025

Embodied-Reasoner: Synergizing Visual Search, Reasoning, and Action for Embodied Interactive Tasks

arXiv 2025

2025

Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models

arXiv 2025

2025

TimeHC-RL: Temporal-aware Hierarchical Cognitive Reinforcement Learning for Enhancing LLMs' Social Intelligence

arXiv 2025

2025

SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation

arXiv 2025

2025

Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning

arXiv 2025

2025

Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency

arXiv 2025

2025

Think Twice, Click Once: Enhancing GUI Grounding via Fast and Slow Systems

arXiv 2025

2025

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

arXiv 2024

2024

Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization

arXiv 2024

2024

Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model

arXiv 2024

2024

Entering Real Social World! Benchmarking the Social Intelligence of Large Language Models from a First-person Perspective

arXiv 2024

2024

Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow

arXiv 2023

2023

Affiliations

No known affiliations.

Frequent co-authors

10

from 22 papers