Attribution
Training Language Models to Self-Correct via Reinforcement Learning
arXiv 2024
Transformers Meet Directed Graphs
arXiv 2023
from 2 papers
Aleksandra Faust
Ali Taylan Cemgil
Avi Singh
Aviral Kumar
Colton Bishop
Daniel Mankowitz
Disha Shrivastava
Doina Precup
Feryal Behbahani
George Tucker