This paper introduces a method for computation-efficient personalized style video generation that requires no personalized video data. It reduces the generation time of similarly sized video diffusion models from 25 seconds to around 1 second while maintaining the same level of performance. The method relies on a dual-level decoupled learning approach: 1) decoupling the learning of video style from the acceleration of video generation, which enables personalized style video generation without any personalized style video data, and 2) decoupling the acceleration of image generation from the acceleration of video motion generation, which improves training efficiency and mitigates the negative effects of low-quality video data.
AnimateLCM: Computation-Efficient Personalized Style Video Generation without Personalized Video Data
AnimateLCM, an extension of the Latent Consistency Model, generates high-fidelity video efficiently through decoupled consistency learning, remains compatible with existing adapters, and performs strongly in image-conditioned and layout-conditioned video generation.
- Year: 2024
- Venue: arXiv 2024
- Authors: 8
Attribution
- Abstract & full text: arxiv.org/abs/2402.00769v3
- TL;DR: Semantic Scholar