Kianté Brantley
6 papers
Authored papers
Value-Guided Search for Efficient Chain-of-Thought Reasoning
arXiv 2025
Dataset Reset Policy Optimization for RLHF
arXiv 2024
REBEL: Reinforcement Learning via Regressing Relative Rewards
arXiv 2024
Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF
arXiv 2024
Learning to Generate Better Than Your LLM
arXiv 2023
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
arXiv 2022
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers