Ruobing Xie
- Papers: 23
Authored papers
Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation
arXiv 2026
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models
arXiv 2026
Hybrid Policy Distillation for LLMs
arXiv 2026
Autonomy-of-Experts Models
arXiv 2025
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding
arXiv 2025
Multi-Grained Patch Training for Efficient LLM-based Recommendation
arXiv 2025
The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason
arXiv 2025
PhD: A ChatGPT-Prompted Visual hallucination Evaluation Dataset
CVPR 2025
DHCP: Detecting Hallucinations by Cross-modal Attention Pattern in Large Vision-Language Models
arXiv 2024
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
arXiv 2024
Content-Based Collaborative Generation for Recommender Systems
arXiv 2024
Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication
arXiv 2024
More Expressive Attention with Negative Weights
arXiv 2024
Continuous Speech Tokenizer in Text To Speech
arXiv 2024
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
arXiv 2024
Advancing LLM Reasoning Generalists with Preference Trees
arXiv 2024
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment
arXiv 2024
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
arXiv 2023
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
arXiv 2023
MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation
arXiv 2023
UltraFeedback: Boosting Language Models with High-quality Feedback
ICML
Selective Fairness in Recommendation via Prompts
arXiv 2022
Pruning Pre-trained Language Models Without Fine-Tuning
arXiv 2022
Frequent co-authors
10 from 23 papers