[Paper] RAVEN: Erasing Invisible Watermarks via Novel View Synthesis

Published: January 13, 2026 at 01:59 PM EST
4 min read
Source: arXiv - 2601.08832v1

Overview

Invisible watermarks are increasingly used to prove the provenance of AI‑generated images, but their robustness against clever attacks is still an open question. The paper “RAVEN: Erasing Invisible Watermarks via Novel View Synthesis” shows that a watermark can be stripped away simply by generating a new, slightly shifted view of the same scene—much like looking at an object from a different angle. By treating watermark removal as a view‑synthesis problem, the authors expose a fundamental weakness in current watermark designs and propose a diffusion‑based, zero‑shot attack that works without any knowledge of the watermark or its detector.

Key Contributions

  • Reframing watermark removal as a novel view synthesis task, demonstrating that semantic‑preserving geometric changes naturally erase invisible marks.
  • RAVEN framework: a zero‑shot diffusion pipeline that performs controlled latent‑space transformations guided by a view‑correspondence attention module, preserving structure while discarding the watermark.
  • Model‑agnostic attack: works on frozen pre‑trained diffusion models, requiring no access to the watermark detector, the watermark key, or any training on the target watermarking method.
  • Comprehensive evaluation across 15 state‑of‑the‑art invisible watermarking schemes, beating 14 baseline removal attacks in both watermark suppression and visual quality.
  • Open‑source release of code and pretrained components, enabling reproducible research and facilitating the development of more resilient watermarking techniques.

Methodology

  1. Latent‑Space View Perturbation – Starting from the latent representation of an input image (produced by a pre‑trained diffusion model), RAVEN applies a small geometric transformation (e.g., a slight rotation or translation) that mimics a new camera viewpoint.
  2. View‑Guided Correspondence Attention – To keep the reconstructed image faithful to the original content, an attention module aligns patches between the original and transformed latent maps, ensuring that edges, textures, and object layouts stay consistent.
  3. Diffusion‑Based Reconstruction – The perturbed latent is fed back through the diffusion decoder, which synthesizes a high‑fidelity image of the “new view.” Because the watermark is tightly coupled to the original pixel arrangement, the viewpoint shift effectively disrupts its embedding while leaving the visual scene intact.
  4. Zero‑Shot Operation – No fine‑tuning or watermark‑specific training is required; the pipeline can be applied directly to any image generated by a diffusion model, making it a universal removal tool (a minimal sketch of the core idea follows this list).
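
To make steps 1 and 3 concrete, here is a minimal sketch of the encode → warp → decode idea: an image is mapped into the latent space of a pre‑trained VAE, the latent receives a small affine "viewpoint" shift, and the result is decoded back to pixels. This is only an illustration of the underlying principle, not the authors' RAVEN pipeline; it omits the view‑guided correspondence attention and the diffusion‑based reconstruction, and the `stabilityai/sd-vae-ft-mse` checkpoint, rotation angle, and translation are assumptions chosen for the example.

```python
# Minimal sketch of the encode -> latent view shift -> decode idea behind
# steps 1 and 3. It is NOT the authors' RAVEN pipeline: the view-guided
# correspondence attention and the diffusion-based reconstruction are omitted,
# and the VAE checkpoint and perturbation magnitudes are illustrative guesses.
import math
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision.transforms.functional import to_tensor, to_pil_image
from diffusers import AutoencoderKL

device = "cuda" if torch.cuda.is_available() else "cpu"
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").to(device).eval()

def small_view_shift(latent: torch.Tensor, angle_deg: float = 2.5, tx: float = 0.05) -> torch.Tensor:
    """Warp a latent map (B, C, H, W) with a tiny rotation plus translation."""
    theta = math.radians(angle_deg)
    affine = torch.tensor(
        [[math.cos(theta), -math.sin(theta), tx],
         [math.sin(theta),  math.cos(theta), 0.0]],
        dtype=latent.dtype, device=latent.device,
    ).unsqueeze(0).repeat(latent.shape[0], 1, 1)
    grid = F.affine_grid(affine, list(latent.shape), align_corners=False)
    # Reflection padding avoids black borders that would reveal the warp.
    return F.grid_sample(latent, grid, padding_mode="reflection", align_corners=False)

@torch.no_grad()
def attack(img: Image.Image) -> Image.Image:
    """Encode, shift the 'viewpoint' in latent space, and decode."""
    # Assumes dimensions divisible by 8, which is standard for diffusion outputs.
    x = to_tensor(img.convert("RGB")).unsqueeze(0).to(device) * 2.0 - 1.0  # scale to [-1, 1]
    z = vae.encode(x).latent_dist.sample()      # step 1: latent representation
    z = small_view_shift(z)                     # step 1: simulated new viewpoint
    x_hat = vae.decode(z).sample                # step 3: reconstruct the "new view"
    return to_pil_image(((x_hat.clamp(-1, 1) + 1.0) / 2.0).squeeze(0).cpu())

# attack(Image.open("watermarked.png")).save("attacked.png")
```

Even this stripped‑down version shows why a geometric re‑rendering threatens pixel‑coupled watermarks: the decoded image depicts the same scene, but the exact pixel arrangement the watermark was tied to no longer exists. RAVEN's correspondence attention and diffusion reconstruction then recover the visual fidelity that a bare warp would lose.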

Results & Findings

  • Watermark Suppression: RAVEN reduces detection rates across the 15 watermarking methods by an average of 78%, outperforming the strongest baseline (a frequency‑domain filter) by +12% in suppression.
  • Perceptual Quality: Measured by LPIPS and SSIM, the synthesized images retain >0.95 SSIM and <0.08 LPIPS, indicating that visual fidelity is largely unchanged (a short metric‑computation sketch follows this list).
  • Robustness Across Datasets: Experiments on COCO, LAION‑Aesthetics, and a proprietary AI‑art dataset show consistent performance, confirming that the attack is not limited to a specific image domain.
  • Ablation Studies: Removing the correspondence attention module drops SSIM by ~0.07, highlighting its role in preserving structural consistency. Varying the magnitude of the view shift reveals a sweet spot (≈2–3° rotation or ≈5% translation) where watermark removal is maximized without perceptible artifacts.
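
For readers who want to sanity‑check attacked images of their own, the perceptual metrics above map directly onto off‑the‑shelf libraries. The sketch below uses the `lpips` package and scikit‑image's SSIM implementation; it is not the authors' evaluation harness, and the file names are placeholders.

```python
# Compute the two perceptual-quality metrics reported above (SSIM and LPIPS)
# for a watermarked image and its attacked counterpart. This mirrors standard
# usage of the lpips and scikit-image libraries, not the paper's exact setup;
# both images are assumed to share the same resolution.
import numpy as np
import torch
import lpips
from PIL import Image
from skimage.metrics import structural_similarity

def load_rgb(path: str) -> np.ndarray:
    return np.array(Image.open(path).convert("RGB"))

def to_lpips_tensor(a: np.ndarray) -> torch.Tensor:
    # LPIPS expects float tensors shaped (N, 3, H, W) in the range [-1, 1].
    return torch.from_numpy(a).permute(2, 0, 1).unsqueeze(0).float() / 127.5 - 1.0

original = load_rgb("watermarked.png")   # image before the attack
attacked = load_rgb("attacked.png")      # image after watermark removal

# SSIM on uint8 RGB images: closer to 1.0 means more structural similarity.
ssim = structural_similarity(original, attacked, channel_axis=2, data_range=255)

# LPIPS: lower means perceptually closer; net="alex" is the common default.
lpips_model = lpips.LPIPS(net="alex")
with torch.no_grad():
    lpips_dist = lpips_model(to_lpips_tensor(original), to_lpips_tensor(attacked)).item()

print(f"SSIM:  {ssim:.3f}   (paper reports > 0.95 on average)")
print(f"LPIPS: {lpips_dist:.3f}   (paper reports < 0.08 on average)")
```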

Practical Implications

  • For Platform Engineers: Current invisible watermarking schemes that rely solely on pixel‑space or frequency‑domain robustness may be insufficient. Systems need to consider attacks that alter the semantic geometry of images.
  • For Watermark Designers: Embedding strategies must become view‑invariant—for example, by distributing the watermark across 3‑D scene representations or using multi‑view consistency checks during verification.
  • For Developers of Generative Models: The RAVEN pipeline can be integrated as a diagnostic tool to stress‑test any new watermarking method before deployment (see the harness sketch after this list).
  • Legal & Compliance: The existence of a low‑cost, zero‑shot removal attack could affect the evidentiary weight of invisible watermarks in copyright disputes, prompting a re‑evaluation of reliance on such marks alone.
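
For the developer‑facing point above, a pre‑deployment stress test can be as small as the harness sketched below: embed a watermark, run a removal attack, and record how often the detector still fires. Everything here is hypothetical scaffolding; `embed` and `detect` stand in for whatever watermarking API is under test, and `attack` can be any removal function (for example, the latent‑warp sketch in the Methodology section).

```python
# Hypothetical stress-test harness for watermark robustness. `embed` and
# `detect` are placeholders for the watermarking system under test; `attack`
# is any image-to-image removal attack. None of this is from the paper.
from pathlib import Path
from typing import Callable
from PIL import Image

def stress_test(
    image_paths: list[Path],
    embed: Callable[[Image.Image], Image.Image],
    detect: Callable[[Image.Image], bool],
    attack: Callable[[Image.Image], Image.Image],
) -> float:
    """Return the fraction of watermarked images still detected after the attack."""
    survived = 0
    for path in image_paths:
        watermarked = embed(Image.open(path).convert("RGB"))
        if detect(attack(watermarked)):
            survived += 1
    return survived / max(len(image_paths), 1)

# Example (all callables are placeholders for your own system):
# rate = stress_test(sorted(Path("testset").glob("*.png")), embed, detect, attack)
# print(f"Detection rate after attack: {rate:.1%}")
```

A detection rate that collapses under a cheap, zero‑shot attack of this kind is a strong signal to revisit the embedding strategy before relying on the mark for provenance claims.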

Limitations & Future Work

  • Dependence on Diffusion Models: RAVEN assumes access to a diffusion decoder compatible with the image source; applying it to non‑diffusion generators may require additional adaptation.
  • Small View Perturbations: Extremely large viewpoint changes can introduce noticeable distortions, limiting the attack’s stealthiness for certain image types (e.g., highly structured graphics).
  • Counter‑Measures Not Explored: The paper does not propose concrete watermark designs that resist view‑synthesis attacks, leaving that as an open research direction.
  • Future Directions: Extending the approach to video frames, investigating adversarial training of watermarks against view synthesis, and exploring hybrid attacks that combine geometric and frequency‑domain perturbations.

Authors

  • Fahad Shamshad
  • Nils Lukas
  • Karthik Nandakumar

Paper Information

  • arXiv ID: 2601.08832v1
  • Categories: cs.CV
  • Published: January 13, 2026