TextArena: Multi-Agent Text-Based Games for LLM Evaluation

Open-source library of 100+ text-based multi-agent games (negotiation, deception, strategy) for evaluating LLMs in head-to-head interactive settings.

Publisher: Singapore A*STAR (Agency for Science, Technology and Research)
Year: 2025
Venue: preprint
ArXiv: arxiv.org/abs/2502.06545
Code: github.com/LeonGuertler/TextArena
Authors: 6
Hosting: External source; license unknown

Attribution

Abstract & full text: arxiv.org/abs/2502.06545
TL;DR: semanticscholar.org/paper/3b17710adc9628139ed76bd83aa70e050a15be35
Code: github.com/LeonGuertler/TextArena

Introduces 3 artifacts - 1 eval, 2 tools

TL;DR

TextArena is an open-source collection of more than 100 text-based multi-agent games covering negotiation, deception, and strategy, built to evaluate LLMs in head-to-head interactive play.

Artifacts

Evals

TextArena

Tools

Openenv Textarena RL Env (Hugging Face)
TextArena

Authors

Andy Pearce, Bo Liu, Bobby Cheng, Cheston Tan, Leon Guertler, Simon Yu