Ruobing Xie

Papers
23

Authored papers

23

Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation

arXiv 2026

2026

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

arXiv 2026

2026

Hybrid Policy Distillation for LLMs

arXiv 2026

2026

Autonomy-of-Experts Models

arXiv 2025

2025

LaSeR: Reinforcement Learning with Last-Token Self-Rewarding

arXiv 2025

2025

Multi-Grained Patch Training for Efficient LLM-based Recommendation

arXiv 2025

2025

The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason

arXiv 2025

2025

PhD: A ChatGPT-Prompted Visual hallucination Evaluation Dataset

CVPR 2025

2024

DHCP: Detecting Hallucinations by Cross-modal Attention Pattern in Large Vision-Language Models

arXiv 2024

2024

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

arXiv 2024

2024

Content-Based Collaborative Generation for Recommender Systems

arXiv 2024

2024

Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication

arXiv 2024

2024

More Expressive Attention with Negative Weights

arXiv 2024

2024

Continuous Speech Tokenizer in Text To Speech

arXiv 2024

2024

Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

arXiv 2024

2024

Advancing LLM Reasoning Generalists with Preference Trees

arXiv 2024

2024

Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment

arXiv 2024

2024

AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

arXiv 2023

2023

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs

arXiv 2023

2023

MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation

arXiv 2023

2023

UltraFeedback: Boosting Language Models with High-quality Feedback

ICML

2023

Selective Fairness in Recommendation via Prompts

arXiv 2022

2022

Pruning Pre-trained Language Models Without Fine-Tuning

arXiv 2022

2022

Affiliations

No known affiliations.

Frequent co-authors

10, from 23 papers