Self-Supervised Video Forensics by Audio-Visual Anomaly Detection

A video forensics method uses an autoregressive model trained on unlabeled real videos to detect inconsistencies between audio and visual signals.

Year
2023
Venue
CVPR 2023
Authors
3
Hosting
Abstract only (ARXIV-DEFAULT)

Attribution

Abstract & full text: arxiv.org/abs/2301.01767v2 (ARXIV-DEFAULT)
TL;DR: Semantic Scholar

Abstract

Manipulated videos often contain subtle inconsistencies between their visual and audio signals. We propose a video forensics method, based on anomaly detection, that can identify these inconsistencies, and that can be trained solely using real, unlabeled data. We train an autoregressive model to generate sequences of audio-visual features, using feature sets that capture the temporal synchronization between video frames and sound. At test time, we then flag videos that the model assigns low probability. Despite being trained entirely on real videos, our model obtains strong performance on the task of detecting manipulated speech videos. Project site: https://cfeng16.github.io/audio-visual-forensics
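
The train-on-real, flag-low-probability idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: a smoothed first-order Markov chain stands in for the autoregressive model, and the integer tokens are a hypothetical stand-in for discretized audio-visual synchronization features. The function names, the synthetic data, and the percentile threshold are all assumptions made for illustration.

```python
import numpy as np

def fit_bigram(sequences, vocab_size, alpha=1.0):
    """Fit a first-order autoregressive (bigram) model with add-alpha smoothing."""
    counts = np.full((vocab_size, vocab_size), alpha, dtype=float)
    for seq in sequences:
        for prev, nxt in zip(seq[:-1], seq[1:]):
            counts[prev, nxt] += 1.0
    # Each row becomes p(next token | previous token).
    return counts / counts.sum(axis=1, keepdims=True)

def anomaly_score(seq, trans):
    """Mean negative log-likelihood per transition; higher means more anomalous."""
    logp = [np.log(trans[p, n]) for p, n in zip(seq[:-1], seq[1:])]
    return -float(np.mean(logp))

rng = np.random.default_rng(0)
VOCAB = 5  # hypothetical number of discretized sync-offset bins

# Synthetic "real" videos: the sync token stays near a constant offset (bin 2).
real_p = [0.0, 0.05, 0.9, 0.05, 0.0]
train = [rng.choice(VOCAB, size=50, p=real_p) for _ in range(200)]
trans = fit_bigram(train, VOCAB)

# The threshold is set from real data only (assumed: 99th percentile of scores),
# so no manipulated examples are needed at training time.
train_scores = [anomaly_score(s, trans) for s in train]
threshold = float(np.percentile(train_scores, 99))

# A held-out "real" sequence and a synthetic "manipulated" one with erratic sync.
real_score = anomaly_score(rng.choice(VOCAB, size=50, p=real_p), trans)
fake_score = anomaly_score(rng.integers(0, VOCAB, size=50), trans)

print(f"threshold={threshold:.2f} real={real_score:.2f} fake={fake_score:.2f}")
print("manipulated sequence flagged:", fake_score > threshold)
```

The design point carried over from the abstract is that only real data is used to fit the model and set the threshold; a sequence is flagged purely because the model assigns it low probability.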
