Wenhao Yu
- Papers: 39
Authored papers (39)
Self-Rewarding Vision-Language Model via Reasoning Decomposition
arXiv 2025
MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly
arXiv 2025
WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model
arXiv 2025
MotionEdit: Benchmarking and Learning Motion-Centric Image Editing
arXiv 2025
R-Zero: Self-Evolving Reasoning LLM from Zero Data
arXiv 2025
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
arXiv 2025
Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation
arXiv 2025
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
arXiv 2025
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
arXiv 2025
Human2LocoMan: Learning Versatile Quadrupedal Manipulation with Human Pretraining
arXiv 2025
ReCode: Updating Code API Knowledge with Reinforcement Learning
arXiv 2025
Don't Throw Away Your Pretrained Model
arXiv 2025
Towards Trustworthy GUI Agents: A Survey
arXiv 2025
Bidirectional LMs are Better Knowledge Memorizers? A Benchmark for Real-world Knowledge Injection
arXiv 2025
VeriGUI: Verifiable Long-Chain GUI Dataset
arXiv 2025
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph
arXiv 2024
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization
arXiv 2024
DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems
arXiv 2024
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
arXiv 2024
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots
arXiv 2024
MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions
arXiv 2024
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
arXiv 2024
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
arXiv 2024
Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks
arXiv 2024
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?
arXiv 2024
Large Language Models are Built-in Autoregressive Search Engines
arXiv 2023
LASER: LLM Agent with State-Space Exploration for Web Navigation
arXiv 2023
PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning
arXiv 2023
Constrained Decision Transformer for Offline Safe Reinforcement Learning
arXiv 2023
Dense X Retrieval: What Retrieval Granularity Should We Use?
arXiv 2023
Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations
arXiv 2023
Exploring Contrast Consistency of Open-Domain Question Answering Systems on Minimally Edited Questions
arXiv 2023
Diversifying Content Generation for Commonsense Reasoning with Mixture of Knowledge Graph Experts
NAACL (DLG4NLP) 2022
Multi-task Self-supervised Graph Neural Networks Enable Stronger Task Generalization
arXiv 2022
Retrieval Augmentation for Commonsense Reasoning: A Unified Approach
arXiv 2022
A Unified Encoder-Decoder Framework with Entity Memory
arXiv 2022
A Survey of Deep Learning for Mathematical Reasoning
arXiv 2022
Generate rather than Retrieve: Large Language Models are Strong Context Generators
arXiv 2022
A Survey of Knowledge-Enhanced Text Generation
arXiv 2020
Frequent co-authors
10 from 39 papers