Xiaoran Fan
- Papers: 13
Authored papers (13)
Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment
arXiv 2026
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination
arXiv 2025
BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset
arXiv 2025
Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments
arXiv 2025
Critique-RL: Training Language Models for Critiquing through Two-Stage Reinforcement Learning
arXiv 2025
Distill Visual Chart Reasoning Ability from LLMs to MLLMs
arXiv 2024
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback
arXiv 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
arXiv 2024
MouSi: Poly-Visual-Expert Vision-Language Models
arXiv 2024
ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios
arXiv 2024
Secrets of RLHF in Large Language Models Part II: Reward Modeling
arXiv 2024
LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin
arXiv 2023
The Rise and Potential of Large Language Model Based Agents: A Survey
arXiv 2023
Frequent co-authors
10, from 13 papers