0

Mask Image Watermarking

MaskMark, an efficient framework for image watermarking, achieves superior performance in various tasks including global and local watermark embedding and extraction, while maintaining high image quality and flexibility.

Year
2025
Venue
arXiv 2025
Authors
8
Hosting
Abstract onlyARXIV-DEFAULT

Cite

Notes

Only stored in your browser.

Attribution

Abstract & full text
arxiv.org/abs/2504.12739v2ARXIV-DEFAULT
TL;DR
Semantic Scholar
Attribution policy →

Abstract

We present MaskMark, a simple, efficient, and flexible framework for image watermarking. MaskMark has two variants: (1) MaskMark-D, which supports global watermark embedding, watermark localization, and local watermark extraction for applications such as tamper detection; (2) MaskMark-ED, which focuses on local watermark embedding and extraction, offering enhanced robustness in small regions to support fine-grined image protection. MaskMark-D builds on the classical encoder-distortion layer-decoder training paradigm. In MaskMark-D, we introduce a simple masking mechanism during the decoding stage that enables both global and local watermark extraction. During training, the decoder is guided by various types of masks applied to watermarked images before extraction, helping it learn to localize watermarks and extract them from the corresponding local areas. MaskMark-ED extends this design by incorporating the mask into the encoding stage as well, guiding the encoder to embed the watermark in designated local regions, which improves robustness under regional attacks. Extensive experiments show that MaskMark achieves state-of-the-art performance in global and local watermark extraction, watermark localization, and multi-watermark embedding. It outperforms all existing baselines, including the recent leading model WAM for local watermarking, while preserving high visual quality of the watermarked images. In addition, MaskMark is highly efficient and adaptable. It requires only 20 hours of training on a single A6000 GPU, achieving 15x computational efficiency compared to WAM. By simply adjusting the distortion layer, MaskMark can be quickly fine-tuned to meet varying robustness requirements.

Authors

8