Attribution
Surgical Post-Training: Cutting Errors, Keeping Knowledge (arXiv 2026)
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models (arXiv 2025)
Beyond Outcomes: Transparent Assessment of LLM Reasoning in Games (arXiv 2024)
from 3 papers
Kai Han
Jonathan Roberts
Samuel Albanie
Aaditya Baranwal
Akash Gupta
Alexander Lo
Alexandru Coca
Anh Totti Nguyen
Ansh Sharma
Brian Pulfer