Liwei Jiang
- Papers: 15
Authored papers
X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents
arXiv 2025
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
arXiv 2024
WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models
arXiv 2024
AI as Humanity's Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
arXiv 2024
A Roadmap to Pluralistic Alignment
arXiv 2024
Faith and Fate: Limits of Transformers on Compositionality
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement
arXiv 2023
Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning
arXiv 2023
Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties
arXiv 2023
Quark: Controllable Text Generation with Reinforced Unlearning
arXiv 2022
ProsocialDialog: A Prosocial Backbone for Conversational Agents
arXiv 2022
SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization
arXiv 2022
ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations
arXiv 2022
NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics
NAACL 2022
Symbolic Knowledge Distillation: from General Language Models to Commonsense Models
NAACL 2022
Affiliations
Frequent co-authors
10 from 15 papers
Yejin Choi
professor
Ximing Lu
Nouha Dziri
researcher
Peter West
Chandra Bhagavatula
Sean Welleck
Allyson Ettinger
Maarten Sap
Ronan Le Bras
Bill Yuchen Lin
researcher