Sanjiban Choudhury
- Papers: 11
Authored papers (11)
Process Reward Models for LLM Agents: Practical Framework and Directions
arXiv 2025
Robotouille: An Asynchronous Planning Benchmark for LLM Agents
arXiv 2025
A Smooth Sea Never Made a Skilled $\texttt{SAILOR}$: Robust Imitation via Learning to Search
arXiv 2025
Aligning LLMs with Domain Invariant Reward Models
arXiv 2025
Multi-Turn Code Generation Through Single-Step Rewards
arXiv 2025
One-Shot Imitation under Mismatched Execution
arXiv 2024
Inverse Reinforcement Learning without Reinforcement Learning
arXiv 2023
A Game-Theoretic Framework for Joint Forecasting and Planning
arXiv 2023
The Virtues of Laziness in Model-based RL: A Unified Objective and Algorithms
arXiv 2023
Learning Shared Safety Constraints from Multi-task Demonstrations
Demo2Code: From Summarizing Demonstrations to Synthesizing Code via Extended Chain-of-Thought
arXiv 2023
Frequent co-authors: 10 (from 11 papers)