SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner

SWE-Flow is a data synthesis framework that uses unit tests to automatically infer incremental development steps and generate a structured Test-Driven Development (TDD) schedule. Fine-tuning open models on the resulting data, which is drawn from real-world projects, significantly improves their performance on TDD-based coding.

Year
2025
Venue
arXiv 2025
Authors
9
Hosting
Abstract only

Abstract & full text
arxiv.org/abs/2506.09003v2

Abstract

We introduce SWE-Flow, a novel data synthesis framework grounded in Test-Driven Development (TDD). Unlike existing software engineering data that rely on human-submitted issues, SWE-Flow automatically infers incremental development steps directly from unit tests, which inherently encapsulate high-level requirements. The core of SWE-Flow is the construction of a Runtime Dependency Graph (RDG), which precisely captures function interactions, enabling the generation of a structured, step-by-step development schedule. At each step, SWE-Flow produces a partial codebase, the corresponding unit tests, and the necessary code modifications, resulting in fully verifiable TDD tasks. With this approach, we generated 16,061 training instances and 2,020 test instances from real-world GitHub projects, creating the SWE-Flow-Eval benchmark. Our experiments show that fine-tuning open models on this dataset significantly improves performance in TDD-based coding. To facilitate further research, we release all code, datasets, models, and Docker images on GitHub.
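
The idea of turning a dependency graph into a step-by-step schedule can be illustrated with a small sketch. This is not the SWE-Flow implementation: the call edges, test names, and the helpers `reachable` and `build_schedule` are all hypothetical, and the sketch assumes the call graph has already been recorded and is acyclic. It orders tests so that each step introduces only the functions that earlier steps have not yet covered.

```python
from graphlib import TopologicalSorter

# Hypothetical call edges recorded while running the tests: caller -> callees.
CALLS = {
    "evaluate": {"parse"},
    "parse": {"tokenize"},
    "tokenize": set(),
}

# Hypothetical mapping from each unit test to the function it enters first.
TEST_ENTRY = {
    "test_tokenize": "tokenize",
    "test_parse": "parse",
    "test_evaluate": "evaluate",
}


def reachable(func, calls):
    """Return every function reachable from func, including func itself."""
    seen, stack = set(), [func]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(calls.get(current, ()))
    return seen


def build_schedule(calls, test_entry):
    """Return (test, new_functions) steps, with callees scheduled before callers."""
    # TopologicalSorter treats the mapped values as predecessors, so callees come first.
    order = list(TopologicalSorter(calls).static_order())
    rank = {func: i for i, func in enumerate(order)}

    implemented = set()
    steps = []
    for test in sorted(test_entry, key=lambda t: rank[test_entry[t]]):
        needed = reachable(test_entry[test], calls)
        new_functions = sorted(needed - implemented, key=rank.__getitem__)
        steps.append((test, new_functions))
        implemented |= needed
    return steps


if __name__ == "__main__":
    for test, new_functions in build_schedule(CALLS, TEST_ENTRY):
        print(test, "->", new_functions)
```

In this toy setup each step pairs one test with the functions that must be added for it to pass, loosely mirroring the abstract's description of a partial codebase, the corresponding unit tests, and the required code modification at each step.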
