Alicia Parrish
8 papers
Authored papers
MSTS: A Multimodal Safety Test Suite for Vision-Language Models
arXiv 2025
Introducing v0.5 of the AI Safety Benchmark from MLCommons
arXiv 2024
DICES Dataset: Diversity in Conversational AI Evaluation for Safety
arXiv 2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
TMLR
DataPerf: Benchmarks for Data-Centric AI Development
QuALITY: Question Answering with Long Input Texts, Yes!
NAACL 2022
BBQ: A Hand-Built Bias Benchmark for Question Answering
Findings (ACL) 2022
BLiMP: The Benchmark of Linguistic Minimal Pairs for English
Affiliations
No known affiliations.
Frequent co-authors
10 from 8 papers