Wenhao Yu

Papers
39

Authored papers

Self-Rewarding Vision-Language Model via Reasoning Decomposition

arXiv 2025

MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly

arXiv 2025

WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model

arXiv 2025

MotionEdit: Benchmarking and Learning Motion-Centric Image Editing

arXiv 2025

R-Zero: Self-Evolving Reasoning LLM from Zero Data

arXiv 2025

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

arXiv 2025

Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation

arXiv 2025

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

arXiv 2025

KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality

arXiv 2025

Human2LocoMan: Learning Versatile Quadrupedal Manipulation with Human Pretraining

arXiv 2025

ReCode: Updating Code API Knowledge with Reinforcement Learning

arXiv 2025

Don't Throw Away Your Pretrained Model

arXiv 2025

Towards Trustworthy GUI Agents: A Survey

arXiv 2025

Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection

arXiv 2025

VeriGUI: Verifiable Long-Chain GUI Dataset

arXiv 2025

RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph

arXiv 2024

OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization

arXiv 2024

DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems

arXiv 2024

Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning

arXiv 2024

Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots

arXiv 2024

MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions

arXiv 2024

WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models

arXiv 2024

LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory

arXiv 2024

Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks

arXiv 2024

DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

arXiv 2024

Large Language Models are Built-in Autoregressive Search Engines

arXiv 2023

LASER: LLM Agent with State-Space Exploration for Web Navigation

arXiv 2023

PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning

arXiv 2023

Constrained Decision Transformer for Offline Safe Reinforcement Learning

arXiv 2023

Dense X Retrieval: What Retrieval Granularity Should We Use?

arXiv 2023

Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations

arXiv 2023

Exploring Contrast Consistency of Open-Domain Question Answering Systems on Minimally Edited Questions

arXiv 2023

Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts

NAACL (DLG4NLP) 2022 7

Multi-task Self-supervised Graph Neural Networks Enable Stronger Task Generalization

arXiv 2022

Retrieval Augmentation for Commonsense Reasoning: A Unified Approach

arXiv 2022

A Unified Encoder-Decoder Framework with Entity Memory

arXiv 2022

A Survey of Deep Learning for Mathematical Reasoning

arXiv 2022

Generate rather than Retrieve: Large Language Models are Strong Context Generators

arXiv 2022

A Survey of Knowledge-Enhanced Text Generation

arXiv 2020

Affiliations

No known affiliations.

Frequent co-authors

10

from 39 papers