Attribution
Learning Robust Social Strategies with Large Language Models
arXiv 2025
The Markovian Thinker
DeepSeek-R1 Thoughtology: Let's think about LLM Reasoning
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment
arXiv 2024
from 4 papers
Aaron Courville
Amirhossein Kazemnejad
Siva Reddy
Alessandro Sordoni
Aditi Khandelwal
Arkil Patel
Austin Kraft
Benno Krojer
Dereck Piche
Dongchan Shin
researcher