Florian Tramèr

Papers: 13

Attribution

Affiliations & profile: Semantic Scholar

Authored papers

Defeating Prompt Injections by Design

arXiv 2025

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

arXiv 2025

The Jailbreak Tax: How Useful are Your Jailbreak Outputs?

arXiv 2025

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

arXiv 2024

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

arXiv 2024

Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs

arXiv 2024

Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition

arXiv 2024

Evading Black-box Classifiers Without Breaking Eggs

arXiv 2023

Evaluating Superhuman Models with Consistency Checks

arXiv 2023

Universal Jailbreak Backdoors from Poisoned Human Feedback

arXiv 2023

Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems

arXiv 2022

Large Language Models Can Be Strong Differentially Private Learners

2021

Extracting Training Data from Large Language Models

arXiv 2020

Affiliations

No known affiliations.

Frequent co-authors

from 13 papers

Edoardo Debenedetti

Nicholas Carlini

Javier Rando

Daniel Paleka

Francesco Croce

Jie Zhang

Maksym Andriushchenko

Nicolas Flammarion

Adam Roberts

Ahmed Salem