Yuchen Yan
- Papers: 23
Authored papers
KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation
arXiv 2026
Code-A1: Adversarial Evolving of Code LLM and Test LLM via Reinforcement Learning
arXiv 2026
GroundAct: Can LLM Agents Ground Actions in Environmental States?
arXiv 2025
InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning
arXiv 2026
EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering
arXiv 2025
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation
arXiv 2025
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models
arXiv 2025
Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning
arXiv 2025
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model
arXiv 2025
GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts
arXiv 2025
Hierarchical Budget Policy Optimization for Adaptive Reasoning
arXiv 2025
Let LLMs Break Free from Overthinking via Self-Braking Tuning
arXiv 2025
Double-Checker: Enhancing Reasoning of Slow-Thinking LLMs via Self-Critical Fine-Tuning
arXiv 2025
Do Large Language Models Excel in Complex Logical Reasoning with Formal Language?
arXiv 2025
Test-Time Reinforcement Learning for GUI Grounding via Region Consistency
arXiv 2025
VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models
arXiv 2025
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving
arXiv 2025
LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization
arXiv 2025
Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models
arXiv 2025
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models
arXiv 2025
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
arXiv 2024
Triad: A Framework Leveraging a Multi-Role LLM-based Agent to Solve Knowledge Base Question Answering
arXiv 2024
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality
arXiv 2024
Frequent co-authors
10 from 23 papers