Ruizhe Shi

2 papers

Authored papers

Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO

arXiv 2025

Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning

arXiv 2023

No known affiliations.

Co-authors (from 2 papers)

Simon S. Du

Huazhe Xu

Maryam Fazel

Minhak Song

Runlong Zhou

Yanjie Ze

Yuyao Liu

Zihan Zhang