Dean Lee

2 papers

Authored papers

MultiChallenge: A Realistic Multi-Turn Conversation Evaluation Benchmark Challenging to Frontier LLMs

arXiv 2025

The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems

arXiv 2025

No known affiliations.

Co-authors from 2 papers

Summer Yue

researcher

Adam Khoja

Alice Gatti

researcher

Arunim Agarwal

Brad Kenstler

Chen Xing

Cristina Menghini

Dan Hendrycks

director

Ed-Yeremai Cardona

Eduardo Trevino