0

Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics

Multimodal large language models can effectively detect AI-generated DeepFakes through prompt engineering without specific programming.

Year
2024
Venue
arXiv 2024
Authors
10
Hosting
Abstract onlyARXIV-DEFAULT

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2403.14077v4ARXIV-DEFAULT
TL;DR
Semantic Scholar
Attribution policy →

Abstract

DeepFakes, which refer to AI-generated media content, have become an increasing concern due to their use as a means for disinformation. Detecting DeepFakes is currently solved with programmed machine learning algorithms. In this work, we investigate the capabilities of multimodal large language models (LLMs) in DeepFake detection. We conducted qualitative and quantitative experiments to demonstrate multimodal LLMs and show that they can expose AI-generated images through careful experimental design and prompt engineering. This is interesting, considering that LLMs are not inherently tailored for media forensic tasks, and the process does not require programming. We discuss the limitations of multimodal LLMs for these tasks and suggest possible improvements.

Authors

10