[Paper] HarmonicAttack: An Adaptive Cross-Domain Audio Watermark Removal

Published: November 26, 2025 at 11:51 AM EST
4 min read
Source: arXiv - 2511.21577v1

Overview

The paper presents HarmonicAttack, a new technique for stripping watermarks from AI‑generated audio. By showing that watermarks can be removed quickly and with limited prior knowledge, the work forces a re‑examination of how robust current audio‑watermarking defenses really are—an issue that matters for anyone building or defending voice‑based AI products.

Key Contributions

  • Adaptive removal pipeline that only needs the ability to generate watermarks from a target scheme (no secret keys or internal model details).
  • Dual‑path convolutional autoencoder that processes audio simultaneously in the time domain and the frequency (spectral) domain, improving separation of watermark and content.
  • GAN‑style training that encourages the model to produce clean, natural‑sounding audio while suppressing watermark artifacts.
  • Cross‑scheme generalization: a single trained model can remove watermarks from any sample produced by the targeted scheme, and it transfers reasonably well to out‑of‑distribution audio.
  • Near real‑time performance: inference runs fast enough for interactive or batch processing scenarios, unlike many prior, computationally heavy attacks.
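The first bullet's threat model is easy to make concrete: the attacker only needs a callable embedder, never its keys or internals. The sketch below uses a hypothetical toy embedder (a faint high-frequency tone) standing in for a real scheme such as AudioSeal or WavMark; `toy_embed_watermark` and `make_training_pairs` are illustrative names, not from the paper.

```python
import numpy as np

def toy_embed_watermark(clean: np.ndarray, sr: int = 16000) -> np.ndarray:
    """Stand-in for a real embedder (e.g., AudioSeal, WavMark): here the
    'watermark' is just a faint 7 kHz tone added to the waveform."""
    t = np.arange(len(clean)) / sr
    return clean + 0.001 * np.sin(2 * np.pi * 7000 * t)

def make_training_pairs(clean_clips, embed_fn):
    """Black-box data generation: the attacker merely *calls* the
    watermarker on arbitrary clean audio to build (clean, watermarked)
    pairs for training a removal model."""
    return [(clip, embed_fn(clip)) for clip in clean_clips]

clips = [np.random.randn(16000).astype(np.float32) for _ in range(4)]
pairs = make_training_pairs(clips, toy_embed_watermark)
```

In practice the clean corpus would span speech, music, and environmental sounds, as the Methodology section below describes.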

Methodology

  1. Assumption – The attacker can call the watermarking algorithm (e.g., AudioSeal, WavMark) to embed watermarks on arbitrary clean audio. This is realistic because many watermarking services are publicly available.
  2. Data generation – The authors synthesize paired datasets: clean audio ↔ watermarked audio, covering a wide variety of speakers, music, and environmental sounds.
  3. Model architecture
    • Temporal branch: a 1‑D convolutional encoder‑decoder that captures waveform‑level patterns.
    • Spectral branch: a 2‑D convolutional encoder‑decoder that works on short‑time Fourier transform (STFT) magnitude maps, targeting frequency‑domain watermark signatures.
    • The two branches are fused before the decoder output, allowing the network to exploit complementary cues.
  4. Training objective
    • Reconstruction loss (L1/L2) to keep the de‑watermarked audio close to the original clean signal.
    • Adversarial loss from a discriminator that distinguishes real clean audio from the model’s output, pushing the generator toward perceptual realism.
    • Watermark suppression loss that penalizes residual watermark patterns detected by a lightweight watermark detector.
  5. Evaluation – The trained model is tested on unseen watermarked clips from three state‑of‑the‑art schemes, measuring both watermark detection rates after attack and audio quality (PESQ, STOI, MOS).
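Steps 3 and 4 can be sketched in PyTorch as follows. The layer widths, kernel sizes, learned fusion gate, and loss weights are assumptions chosen for illustration, not the authors' configuration; `disc` and `wm_detector` stand in for the discriminator and the lightweight watermark detector.

```python
import torch
import torch.nn as nn

class DualPathDewatermarker(nn.Module):
    """Minimal sketch of a dual-path autoencoder: a waveform (temporal)
    branch and an STFT-magnitude (spectral) branch, fused at the output."""
    def __init__(self, n_fft: int = 512, hop: int = 128):
        super().__init__()
        self.n_fft, self.hop = n_fft, hop
        # Temporal branch: 1-D convs over the raw waveform.
        self.temporal = nn.Sequential(
            nn.Conv1d(1, 32, 15, padding=7), nn.ReLU(),
            nn.Conv1d(32, 32, 15, padding=7), nn.ReLU(),
            nn.Conv1d(32, 1, 15, padding=7),
        )
        # Spectral branch: 2-D convs over the STFT magnitude map.
        self.spectral = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )
        # Fusion: a single learned gate mixing the two reconstructions.
        self.mix = nn.Parameter(torch.tensor(0.5))

    def forward(self, wav: torch.Tensor) -> torch.Tensor:  # wav: (B, samples)
        t_out = self.temporal(wav.unsqueeze(1)).squeeze(1)
        window = torch.hann_window(self.n_fft, device=wav.device)
        spec = torch.stft(wav, self.n_fft, self.hop, window=window,
                          return_complex=True)
        mag, phase = spec.abs(), spec.angle()
        mag = self.spectral(mag.unsqueeze(1)).squeeze(1)
        s_out = torch.istft(torch.polar(mag, phase), self.n_fft, self.hop,
                            window=window, length=wav.shape[-1])
        g = torch.sigmoid(self.mix)
        return g * t_out + (1 - g) * s_out

def training_losses(model, clean, watermarked, disc, wm_detector):
    """Three-term objective from step 4: reconstruction + adversarial
    realism + residual-watermark suppression (unit weights here)."""
    out = model(watermarked)
    rec = nn.functional.l1_loss(out, clean)
    adv = -disc(out).mean()          # GAN-style realism term
    wm = wm_detector(out).mean()     # penalize detectable residual marks
    return rec + adv + wm
```

A real training run would alternate discriminator and generator updates as in standard GAN practice; that loop is omitted here for brevity.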

Results & Findings

| Watermark Scheme | Detection Rate Before Attack | Detection Rate After HarmonicAttack | PESQ (clean → attacked) |
|---|---|---|---|
| AudioSeal | 96 % | 12 % | 4.3 → 4.1 |
| WavMark | 94 % | 8 % | 4.2 → 4.0 |
| SilentCipher | 92 % | 10 % | 4.1 → 3.9 |
  • HarmonicAttack consistently reduces watermark detectability to single‑digit percentages, outperforming prior removal baselines by 30‑45 % absolute.
  • Audio quality degradation is minimal; subjective listening tests show > 80 % of participants cannot tell the difference from the original.
  • Inference processes one second of audio in roughly 25 ms on a single GPU, comfortably faster than real time, making it practical for interactive use and large‑scale batch processing.
  • Transfer experiments (different speakers, languages, or unseen background noises) show only a ~5 % drop in removal effectiveness, indicating good generalization.
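The before/after detection rates in the table are simple threshold statistics over a detector's per-clip scores. A minimal sketch (the threshold and score values below are illustrative, not the paper's data):

```python
import numpy as np

def detection_rate(scores, threshold: float = 0.5) -> float:
    """Fraction of clips whose watermark-detector score exceeds the threshold."""
    return float(np.mean(np.asarray(scores) > threshold))

# Illustrative detector scores: high before the attack, low after it.
before = detection_rate([0.97, 0.91, 0.88, 0.95])   # all detected
after = detection_rate([0.12, 0.41, 0.08, 0.55])    # mostly stripped
absolute_drop = before - after
```

The "30‑45 % absolute" improvement over prior baselines is measured on exactly this kind of quantity: the post-attack detection rate, compared scheme by scheme.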

Practical Implications

  • For watermark designers: The results expose a concrete attack surface—if a watermark can be re‑generated, an adversary can train a removal model without ever seeing the secret key. Designers must therefore consider non‑reversible or cryptographically bound embeddings that cannot be trivially reproduced.
  • For AI‑generated media platforms: Relying solely on watermark detection as a compliance check is risky. Complementary provenance methods (e.g., secure logging, blockchain‑based fingerprints) become essential.
  • For developers of voice‑cloning or deep‑fake detection tools: HarmonicAttack can be used as a benchmark to stress‑test detection pipelines, ensuring they remain robust when attackers first strip watermarks.
  • For security auditors: The dual‑path autoencoder architecture is lightweight enough to be integrated into automated audit pipelines that scan large audio corpora for hidden watermarks or their removal.

Limitations & Future Work

  • Assumes access to the watermark generator – While realistic for open‑source schemes, proprietary or hardware‑locked watermarks may not be reproducible.
  • Focuses on three watermark families – The attack’s efficacy against future, more sophisticated schemes (e.g., adaptive, content‑aware embeddings) remains untested.
  • Audio‑only domain – Extending the approach to multimodal media (video with audio watermarks) or to streaming scenarios with low‑latency constraints is an open challenge.
  • Potential arms race – The authors suggest exploring adversarial watermarking where the embedding process is trained jointly with a removal model, akin to GANs, to harden watermarks against this class of attacks.
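The "arms race" direction in the last bullet, training the embedder jointly against a removal model, resembles a two-player GAN loop. A minimal sketch, where the single-layer embedder/remover networks, the toy losses, and the 0.1 weight are all illustrative assumptions rather than anything proposed in the paper:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins: a learnable embedder and a removal network.
embedder = nn.Conv1d(1, 1, 9, padding=4)
remover = nn.Conv1d(1, 1, 9, padding=4)
opt_e = torch.optim.Adam(embedder.parameters(), lr=1e-3)
opt_r = torch.optim.Adam(remover.parameters(), lr=1e-3)

for step in range(3):                       # a few illustrative steps
    clean = torch.randn(8, 1, 4000)

    # Remover tries to map watermarked audio back to the clean signal.
    opt_r.zero_grad()
    stripped = remover(embedder(clean).detach())
    loss_r = (stripped - clean).abs().mean()
    loss_r.backward()
    opt_r.step()

    # Embedder stays imperceptible (first term) while trying to make its
    # mark survive removal (second term, negated so it is maximized).
    opt_e.zero_grad()
    marked = embedder(clean)
    survived = remover(marked)
    loss_e = (marked - clean).abs().mean() - 0.1 * (survived - clean).abs().mean()
    loss_e.backward()
    opt_e.step()
```

Whether such joint training actually hardens a watermark against attacks like HarmonicAttack is precisely the open question the authors raise.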

Bottom line: HarmonicAttack shows that current audio watermarking methods can be peeled away with relatively modest resources, prompting a rethink of how we protect AI‑generated voice content in real‑world deployments.

Authors

  • Kexin Li
  • Xiao Hu
  • Ilya