Steven Basart

Researcher at the Center for AI Safety; co-author of MMLU, MATH, and several safety benchmarks; longtime collaborator of Dan Hendrycks.

Role: researcher
Currently at: Center for AI Safety
Twitter: twitter.com/xksteven
GitHub: github.com/xksteven
Scholar: scholar.google.com/citations
Papers: 9

Attribution

Affiliations & profile: scholar.google.com/citations

Authored papers

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

arXiv 2024

2024

Representation Engineering: A Top-Down Approach to AI Transparency

arXiv 2023

2023

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark

arXiv 2023

2023

Measuring Mathematical Problem Solving With the MATH Dataset

NeurIPS

2021

Measuring Coding Challenge Competence With APPS

arXiv 2021

2021

Measuring Massive Multitask Language Understanding

ICLR

2020

The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization

ICCV 2021

2020

Aligning AI With Shared Human Values

arXiv 2020

2020

Natural Adversarial Examples

CVPR 2021

2019

Affiliations

Currently at

Center for AI Safety

researcher · nonprofit

Previously

University of California, Berkeley

university lab

Frequent co-authors

from 9 papers

Dan Hendrycks

director

9 shared papers

Dawn Song

professor

7 shared papers

Jacob Steinhardt

founder

6 shared papers

Andy Zou

founder

4 shared papers

Collin Burns

researcher

4 shared papers

Mantas Mazeika

researcher

4 shared papers

Nathaniel Li

grad-student

3 shared papers

Saurav Kadavath

researcher

3 shared papers

Akul Arora

researcher

2 shared papers

Alexander Pan

2 shared papers