0

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

Active

Assesses whether AI agents can be hijacked by malicious third parties using prompt injections in simple environments such as a workspace or travel booking app.

Domain
Safeguards
License
mit
Published
Jul 2025
Notable for
Benchmark for evaluating Safeguards.

Cite

Notes

Only stored in your browser.

Related tools

4
View all

Implementations, trainers, datasets and scaffolds linked to this eval.

FAQ

What is AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents?
Assesses whether AI agents can be hijacked by malicious third parties using prompt injections in simple environments such as a workspace or travel booking app.
How can a model improve its AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents score?
Tools linked to AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents on Sophon include Agent DOJO RL Env (Prime Community), Agent DOJO RL Env (Prime Intellect), CASA House RL Env (Community), BE LIKE RL Env (Community) - RL environments, datasets, and scaffolds that target this eval.
What license is AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents under?
AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents is available under mit.