Dongmei Zhang

SWE-bench Goes Live!

arXiv 2025

VEM: Environment-Free Exploration for Training GUI Agent with Value Environment Model

arXiv 2025

GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents

arXiv 2025

RePrompt: Reasoning-Augmented Reprompting for Text-to-Image Generation via Reinforcement Learning

arXiv 2025

MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark

arXiv 2025

Text2Grad: Reinforcement Learning from Natural Language Feedback

arXiv 2025

UFO^3: Weaving the Digital Agent Galaxy

arXiv 2025

UFO: A UI-Focused Agent for Windows OS Interaction

arXiv 2024

LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

arXiv 2024

Large Action Models: From Inception to Implementation

arXiv 2024

Large Language Model-Brained GUI Agents: A Survey

arXiv 2024

EfficientRAG: Efficient Retriever for Multi-Hop Question Answering

arXiv 2024

Call Me When Necessary: LLMs can Efficiently and Faithfully Reason over Structured Environments

arXiv 2024

PromptIntern: Saving Inference Costs by Internalizing Recurrent Prompt during Large Language Model Fine-tuning

arXiv 2024

Auto-Formula: Recommend Formulas in Spreadsheets using Contrastive Learning for Table Representations

arXiv 2024

TaskWeaver: A Code-First Agent Framework

arXiv 2023

Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations

arXiv 2023

LayoutPrompter: Awaken the Design Ability of Large Language Models


Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation

arXiv 2023

Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond

arXiv 2023

Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering

arXiv 2023

SoTaNa: The Open-Source Software Development Assistant

arXiv 2023

Is Bigger and Deeper Always Better? Probing LLaMA Across Scales and Layers

arXiv 2023

Table Meets LLM: Can Large Language Models Understand Structured Table Data? A Benchmark and Empirical Study

arXiv 2023

Conservative State Value Estimation for Offline Reinforcement Learning
