Yanghua Xiao
- Papers: 32
Authored papers (32)
DIVE: Scaling Diversity in Agentic Task Synthesis for Generalizable Tool Use
arXiv 2026
SEIF: Self-Evolving Reinforcement Learning for Instruction Following
arXiv 2026
GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization (V1.0)
arXiv 2026
ARM: Adaptive Reasoning Model
arXiv 2025
Beyond the Trade-off: Self-Supervised Reinforcement Learning for Reasoning Models' Instruction Following
arXiv 2025
ARIA: Training Language Agents with Intention-Driven Reward Aggregation
arXiv 2025
CoSER: Coordinating LLM-Based Persona Simulation of Established Roles
arXiv 2025
Order Matters: Investigate the Position Bias in Multi-constraint Instruction Following
arXiv 2025
Step-by-Step Mastery: Enhancing Soft Constraint Following Ability of Large Language Models
arXiv 2025
BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation
arXiv 2025
Reward Shaping to Mitigate Reward Hacking in RLHF
arXiv 2025
AdaptiveLog: An Adaptive Log Analysis Framework with the Collaboration of Large and Small Language Model
arXiv 2025
MCiteBench: A Multimodal Benchmark for Generating Text with Citations
arXiv 2025
ToReMi: Topic-Aware Data Reweighting for Dynamic Pre-Training Data Selection
arXiv 2025
A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models
arXiv 2025
Mind the Generation Process: Fine-Grained Confidence Estimation During LLM Generation
arXiv 2025
AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation
arXiv 2024
ESC-Eval: Evaluating Emotion Support Conversations in Large Language Models
arXiv 2024
From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large Language Models
arXiv 2024
Past Meets Present: Creating Historical Analogy with Large Language Models
arXiv 2024
MultiLingPoT: Enhancing Mathematical Reasoning with Multilingual Program Fine-tuning
arXiv 2024
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
arXiv 2024
How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?
arXiv 2024
Revealing the Barriers of Language Agents in Planning
arXiv 2024
QUILL: Quotation Generation Enhancement of Large Language Models
arXiv 2024
BBT-Fin: Comprehensive Construction of Chinese Financial Domain Pre-trained Language Model, Corpus and Benchmark
arXiv 2023
Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation
arXiv 2023
Can Large Language Models Understand Real-World Complex Instructions?
arXiv 2023
Distilling Script Knowledge from Large Language Models for Constrained Language Planning
arXiv 2023
InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews
arXiv 2023
QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search
arXiv 2023
LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification
arXiv 2020
Frequent co-authors
10 from 32 papers