Attribution
Accelerating Unbiased LLM Evaluation via Synthetic Feedback
arXiv 2025
Free from Bellman Completeness: Trajectory Stitching via Model-based Return-conditioned Supervised Learning
arXiv 2023
from 2 papers
Abhishek Gupta
Andrea Zanette
Chuning Zhu
Qiwen Cui
Runlong Zhou
Simon Shaolei Du
Yuda Song