Xunliang Cai
- Papers: 41
Authored papers (41)
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience
arXiv 2026
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation
arXiv 2026
Self-Distilled Agentic Reinforcement Learning
arXiv 2026
General365: Benchmarking General Reasoning in Large Language Models Across Diverse and Challenging Tasks
arXiv 2026
SCOPE: Signal-Calibrated On-Policy Distillation Enhancement with Dual-Path Adaptive Weighting
arXiv 2026
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning
arXiv 2026
VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions
arXiv 2026
Exploring Reasoning Reward Model for Agents
arXiv 2026
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization
arXiv 2026
Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory
arXiv 2026
AJ-Bench: Benchmarking Agent-as-a-Judge for Environment-Aware Evaluation
arXiv 2026
LongCat-Next: Lexicalizing Modalities as Discrete Tokens
arXiv 2026
LongCat-Flash-Thinking-2601 Technical Report
arXiv 2026
LARY: A Latent Action Representation Yielding Benchmark for Generalizable Vision-to-Action Alignment
arXiv 2026
LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning
arXiv 2026
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs
arXiv 2026
V_{0.5}: Generalist Value Model as a Prior for Sparse RL Rollouts
arXiv 2026
BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search
arXiv 2026
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
arXiv 2025
OneThinker: All-in-one Reasoning Model for Image and Video
arXiv 2025
EditThinker: Unlocking Iterative Reasoning for Any Image Editor
arXiv 2025
LongCat-Image Technical Report
arXiv 2025
OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation
arXiv 2025
Architecture Decoupling Is Not All You Need For Unified Multimodal Model
arXiv 2025
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views
arXiv 2025
LongCat-Video Technical Report
arXiv 2025
CodePlot-CoT: Mathematical Visual Reasoning by Thinking with Code-Driven Images
arXiv 2025
AMO-Bench: Large Language Models Still Struggle in High School Math Competitions
arXiv 2025
R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?
arXiv 2025
VitaBench: Benchmarking LLM Agents with Versatile Interactive Tasks in Real-world Applications
arXiv 2025
Do Large Language Models Excel in Complex Logical Reasoning with Formal Language?
arXiv 2025
Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
arXiv 2025
Making Mathematical Reasoning Adaptive
arXiv 2025
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning
arXiv 2024
ReMamba: Equip Mamba with Effective Long-Sequence Modeling
arXiv 2024
Multi-Programming Language Sandbox for LLMs
arXiv 2024
How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data
arXiv 2024
Mitigating Tail Narrowing in LLM Self-Improvement via Socratic-Guided Sampling
arXiv 2024
Not All Contexts Are Equal: Teaching LLMs Credibility-aware Generation
arXiv 2024
PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation
arXiv 2024
Large Language Models Meet Open-World Intent Discovery and Recognition: An Evaluation of ChatGPT
arXiv 2023
Affiliations
Frequent co-authors
10 from 41 papers