Maarten Sap

Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning

arXiv 2025

AutoPresent: Designing Structured Visuals from Scratch

CVPR 2025

Medical Hallucinations in Foundation Models and Their Impact on Healthcare

arXiv 2025

WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

arXiv 2024

NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models

arXiv 2024

PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models

arXiv 2024

SOTOPIA-$π$: Interactive Learning of Socially Intelligent Language Agents

arXiv 2024

Leftover Lunch: Advantage-based Offline Reinforcement Learning for Language Models

arXiv 2023


Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties

arXiv 2023


NLPositionality: Characterizing Design Biases of Datasets and Models

arXiv 2023


Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

TMLR

ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection

ACL 2022

SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

arXiv 2022

When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

arXiv 2022

Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts

arXiv 2022

ProsocialDialog: A Prosocial Backbone for Conversational Agents

arXiv 2022

Challenges in Automated Debiasing for Toxic Language Detection

EACL 2021


DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts

ACL 2021


Affiliations

No known affiliations.

Frequent co-authors

from 19 papers

Yejin Choi

professor

8 shared papers

Ximing Lu

6 shared papers

Faeze Brahman

researcher

4 shared papers

Liwei Jiang

4 shared papers

Ronan Le Bras

4 shared papers

Alisa Liu

researcher

Noah A. Smith

Shuyue Stella Li

Xuhui Zhou

researcher