StableMotion: One-Step Motion Estimation with Diffusion Prior

We present StableMotion, a novel framework that leverages the geometric and content priors of large-scale pretrained image diffusion models for motion estimation in single-image rectification tasks such as Stitched Image Rectangling (SIR) and Rolling Shutter Correction (RSC). Specifically, StableMotion uses a text-to-image Stable Diffusion (SD) model as its backbone and repurposes it as an image-to-motion estimator. To mitigate the inconsistent outputs that diffusion models produce, we propose the Adaptive Ensemble Strategy (AES), which consolidates multiple outputs into a single cohesive, high-fidelity result. We also identify the Sampling Steps Disaster (SSD), a counterintuitive phenomenon in which increasing the number of sampling steps can degrade results, which motivates our one-step inference design. We evaluate StableMotion on both rectification tasks, where it delivers state-of-the-art performance, and it shows promising transferability in qualitative examples and no-reference evaluations on the unseen SIR-OOD and real-captured RSC benchmarks. Building on the SSD observation, StableMotion runs efficient one-step inference, offering a speedup of over 100× compared with previous diffusion-based methods, even when combined with the optional AES post-processing. Code and weights are available at https://github.com/ivowang/StableMotion.