
ContraFix: Skill-Enhanced Contrastive Runtime Analysis for Vulnerability Repair


Year: 2026

Abstract & full text: arxiv.org/abs/2605.17450

Abstract

As software systems grow increasingly complex, automated vulnerability repair (AVR) remains difficult because the materials available to a repair system are usually failure artifacts rather than repair guidance. Traditional analysis techniques can provide suspicious locations, reduced triggers, or constraints, but they are costly to configure across repositories and seldom directly actionable for patch generation. Recent LLM-based agents can edit and validate repository-level patches, and experience-based systems can reuse prior repair traces or demonstrations, but they still need current-instance evidence that turns a broad, symptom-level failure report into a concrete repair decision. We present ContraFix, an agentic AVR framework that constructs such evidence through contrastive runtime analysis. Starting from a failing witness, ContraFix generates nearby failing and non-failing variants, executes them through aligned probe sites, and compares their runtime states to infer the repair boundary and guide source-level patching. Each candidate patch is accepted only after build and validation. ContraFix also stores validated repair episodes in a dual-track skill base, reusing mutation skills to construct useful variants and correction skills to refine failed patches. On SEC-Bench, ContraFix with GPT-5-mini achieves a resolution rate of 92.0% over three repeated runs and an average resolution rate of 91.8% ± 0.8. On PatchEval, it resolves 73.8% of 225 Go, Python, and JavaScript instances. A semantic audit of benchmark-validated SEC-Bench patches shows that 58.2% of ContraFix's patches are semantically correct, compared with 31.3% for the strongest baseline, indicating that the proposed framework improves semantic correctness beyond benchmark validation.
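
The contrastive step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the target program, the probe site name `parse.len`, the buffer size, and the helpers `run_target`, `mutate`, and `infer_boundary` are all hypothetical, chosen only to show how comparing probe values across nearby failing and non-failing variants might expose a repair boundary.

```python
from dataclasses import dataclass

BUF_SIZE = 8  # hypothetical buffer capacity in the toy target


@dataclass
class Run:
    witness: int   # the input that was executed
    failed: bool   # whether the run "crashed"
    probes: dict   # probe site name -> value observed at that site


def run_target(length: int) -> Run:
    """Toy stand-in for an instrumented program that copies `length` bytes."""
    probes = {"parse.len": length}
    failed = length > BUF_SIZE  # stands in for a sanitizer-detected overflow
    return Run(length, failed, probes)


def mutate(witness: int, radius: int = 4) -> list[int]:
    """Generate nearby variants of the failing witness (assumed mutation rule)."""
    return [w for w in range(witness - radius, witness + radius + 1) if w >= 0]


def infer_boundary(runs: list[Run], site: str):
    """Compare one probe site across passing and failing runs.

    Returns (largest passing value, smallest failing value) when every
    passing value lies below every failing value, else None. Only this
    one direction of separation is checked in the sketch.
    """
    passing = [r.probes[site] for r in runs if not r.failed]
    failing = [r.probes[site] for r in runs if r.failed]
    if not passing or not failing:
        return None  # no contrast available
    if max(passing) < min(failing):
        return (max(passing), min(failing))
    return None


witness = 10  # the original failing input
runs = [run_target(w) for w in mutate(witness)]
boundary = infer_boundary(runs, "parse.len")
print(boundary)  # (8, 9)
```

In this toy setting the inferred boundary suggests a guard of the form `length <= 8` before the copy. Under the framework described above, a patch derived from such a boundary would still need to pass build and validation before being accepted.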