[Paper] EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing

Published: 22 hours ago (March 19, 2026 at 01:59 PM EDT)

2 min read

Source: arXiv

Source: arXiv - 2603.19224v1

Overview

Video object removal aims to eliminate dynamic target objects and their visual effects, such as deformation, shadows, and reflections, while restoring seamless backgrounds. Recent diffusion-based video inpainting and object removal methods can remove the objects but often struggle to erase these effects and to synthesize coherent backgrounds. Beyond method limitations, progress is further hampered by the lack of a comprehensive dataset that systematically captures common object effects across varied environments for training and evaluation. To address this, we introduce VOR (Video Object Removal), a large-scale dataset that provides diverse paired videos, each consisting of one video where the target object is present with its effects and a counterpart where the object and effects are absent, with corresponding object masks. VOR contains 60K high-quality video pairs from captured and synthetic sources, covers five effects types, and spans a wide range of object categories as well as complex, dynamic multi-object scenes. Building on VOR, we propose EffectErase, an effect-aware video object removal method that treats video object insertion as the inverse auxiliary task within a reciprocal learning scheme. The model includes task-aware region guidance that focuses learning on affected areas and enables flexible task switching. Then, an insertion-removal consistency objective that encourages complementary behaviors and shared localization of effect regions and structural cues. Trained on VOR, EffectErase achieves superior performance in extensive experiments, delivering high-quality video object effect erasing across diverse scenarios.

Key Contributions

This paper presents research in the following areas:

cs.CV

Methodology

Please refer to the full paper for detailed methodology.

Practical Implications

This research contributes to the advancement of cs.CV.

Authors

Yang Fu
Yike Zheng
Ziyun Dai
Henghui Ding

Paper Information

arXiv ID: 2603.19224v1
Categories: cs.CV
Published: March 19, 2026
PDF: Download PDF

[Paper] EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing

Overview

Key Contributions

Methodology

Practical Implications

Authors

Paper Information

Related posts

[Paper] Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding

[Paper] Matryoshka Gaussian Splatting

[Paper] Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens

[Paper] MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction