Computational antibody design has seen rapid methodological progress, with dozens of deep generative methods proposed in the past three years, yet the field lacks a standardized benchmark for fair comparison and model development. These methods are evaluated on different SAbDab snapshots and non-overlapping test sets, using incompatible metrics, and the literature fragments the design problem into numerous sub-tasks with no common definition. We introduce CHIMERA-Bench (CDR Modeling with Epitope-guided Redesign), a unified benchmark built around a single canonical task: epitope-conditioned CDR sequence-structure co-design. CHIMERA-Bench provides three components. The first is a curated, deduplicated dataset of 2,922 antibody-antigen complexes with epitope and paratope annotations. The second is a set of three biologically motivated splits that test generalization to unseen epitopes, unseen antigen folds, and prospective temporal targets. The third is a comprehensive evaluation protocol with five metric groups, including novel epitope-specificity measures. We benchmark eleven methods spanning six generative paradigms and report results across all splits. CHIMERA-Bench is the largest dataset of its kind for the antibody design problem, allowing the community to develop and test novel methods and evaluate their generalizability.
CHIMERA-Bench: A Benchmark Dataset for Epitope-Specific Antibody Design
- Year: 2026
- Hosting: full text hosted, CC-BY-4.0
Attribution
- Abstract & full text: arxiv.org/abs/2603.13431 (CC-BY-4.0)