Attribution
OfficeQA Pro: An Enterprise Benchmark for End-to-End Grounded Reasoning
arXiv 2026
Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models
arXiv 2023
from 2 papers
Arnav Singhvi
Cindy Wang
Erich Elsen
Faeze Brahman
researcher
Ivan Zhou
Jacob Portes
Jasmine Collins
Krista Opsahl-Ong
Maarten Sap
Mark Riedl