Dong Yu
- Papers: 59
Authored papers (59)
Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification
arXiv 2026
Penguin-VL: Exploring the Efficiency Limits of VLM with LLM-based Vision Encoders
arXiv 2026
Locas: Your Models are Principled Initializers of Locally-Supported Parametric Memories
arXiv 2026
Towards Comprehensive Semantic Speech Embeddings for Chinese Dialects
arXiv 2026
Self-Rewarding Vision-Language Model via Reasoning Decomposition
arXiv 2025
LeVo: High-Quality Song Generation with Multi-Preference Alignment
arXiv 2025
Lifelong Learning of Large Language Model based Agents: A Roadmap
arXiv 2025
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning
arXiv 2025
WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model
arXiv 2025
RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing
arXiv 2025
R-Zero: Self-Evolving Reasoning LLM from Zero Data
arXiv 2025
Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents
arXiv 2025
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
arXiv 2025
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training
arXiv 2025
Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation
arXiv 2025
DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning
arXiv 2025
Don't Throw Away Your Pretrained Model
arXiv 2025
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models
arXiv 2025
MotionEdit: Benchmarking and Learning Motion-Centric Image Editing
arXiv 2025
Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards
arXiv 2025
Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls
arXiv 2025
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
arXiv 2024
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph
arXiv 2024
Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers
arXiv 2024
Scaling Synthetic Data Creation with 1,000,000,000 Personas
arXiv 2024
EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer
arXiv 2024
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
arXiv 2024
Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks
arXiv 2024
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis
arXiv 2024
Cognitive Kernel: An Open-source Agent System towards Generalist Autopilots
arXiv 2024
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing
arXiv 2024
MathChat: Benchmarking Mathematical Reasoning and Instruction Following in Multi-Turn Interactions
arXiv 2024
STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment
arXiv 2024
DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects
arXiv 2024
When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives
arXiv 2024
LoGU: Long-form Generation with Uncertainty Expressions
arXiv 2024
HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows
arXiv 2024
DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?
arXiv 2024
MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization
arXiv 2024
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization
arXiv 2024
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
arXiv 2024
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer
arXiv 2024
DOCBENCH: A Benchmark for Evaluating LLM-based Document Reading Systems
arXiv 2024
InFoBench: Evaluating Instruction Following Ability in Large Language Models
arXiv 2024
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning
arXiv 2024
Abstraction-of-Thought Makes Language Models Better Reasoners
arXiv 2024
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search
arXiv 2024
MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning
arXiv 2023
Dense X Retrieval: What Retrieval Granularity Should We Use?
arXiv 2023
LASER: LLM Agent with State-Space Exploration for Web Navigation
arXiv 2023
From Language Modeling to Instruction Following: Understanding the Behavior Shift in LLMs after Instruction Tuning
arXiv 2023
The Trickle-down Impact of Reward (In-)consistency on RLHF
arXiv 2023
Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations
arXiv 2023
TencentLLMEval: A Hierarchical Evaluation of Real-World Capabilities for Human-Aligned LLMs
arXiv 2023
Bridging the Gap between Synthetic and Authentic Images for Multimodal Machine Translation
arXiv 2023
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis
arXiv 2022
OASum: Large-Scale Open Domain Aspect-based Summarization
arXiv 2022
Z-LaVI: Zero-Shot Language Solver Fueled by Visual Imagination
arXiv 2022
FAST-RIR: Fast neural diffuse room impulse response generator
arXiv 2021
Frequent co-authors: 10, from 59 papers