Dipendra Misra
- Papers: 6
Authored papers
Dataset Reset Policy Optimization for RLHF
arXiv 2024
Aligning LLM Agents by Learning Latent Preference from User Edits
arXiv 2024
Policy Improvement using Language Feedback Models
arXiv 2024
Towards Principled Representation Learning from Videos for Reinforcement Learning
arXiv 2024
Learning to Generate Better Than Your LLM
arXiv 2023
The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
arXiv 2023
Affiliations
No known affiliations.
Frequent co-authors
10 from 6 papers