ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

Published: 1 month ago (January 3, 2026 at 09:50 AM EST)

Source: Dev.to

Overview

ZoeDepth estimates depth from a single image and handles both near and far objects. It combines two training strategies: relative-depth training, which captures the shape of a scene, and metric-depth training, which preserves real-world scale.
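To make the difference between relative and metric depth concrete, here is a minimal sketch. It assumes, purely for illustration, that a relative prediction differs from true depth by an unknown scale and shift, and it recovers both by least squares from a few known distances. This is not ZoeDepth's method; it only shows the ambiguity that metric supervision removes. The function name and sample values are hypothetical.

```python
def fit_scale_shift(relative, metric):
    """Least-squares scale s and shift t minimizing sum((s*r + t - m)^2)."""
    n = len(relative)
    mean_r = sum(relative) / n
    mean_m = sum(metric) / n
    var_r = sum((r - mean_r) ** 2 for r in relative)
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(relative, metric))
    scale = cov / var_r
    shift = mean_m - scale * mean_r
    return scale, shift


# Hypothetical data: unitless relative depth and matching distances in metres.
relative = [0.0, 0.25, 0.5, 1.0]
metric = [1.0, 2.0, 3.0, 5.0]

scale, shift = fit_scale_shift(relative, metric)
aligned = [scale * r + shift for r in relative]  # relative depth mapped to metres
```

A relative-only model leaves `scale` and `shift` unknown for every new image, which is why it cannot be used directly for measurement.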

Method

Dual learning: The model is first trained on large, diverse datasets using a relative‑depth objective, then fine‑tuned with metric‑depth supervision so that its predictions carry real‑world scale. The paper's flagship model is pre‑trained on 12 datasets and fine‑tuned on two.
Dynamic specialist selection: An internal selector routes each input image to the most suitable “tiny specialist”, a lightweight prediction head tuned for one type of scene, so a single model can cover varied scenes.
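The selection step can be sketched roughly as follows. This is a toy illustration, not the released implementation: the two domains, the depth ranges (10 m and 80 m), and every name below are assumptions, and the "heads" are simple rescalings standing in for learned modules.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_head(max_depth_m):
    """Hypothetical specialist: maps relative depth in [0, 1] to metres."""
    def head(relative_depth):
        return [r * max_depth_m for r in relative_depth]
    return head


# Assumed domains and ranges, chosen only for illustration.
HEADS = {"indoor": make_head(10.0), "outdoor": make_head(80.0)}
DOMAINS = ("indoor", "outdoor")


def route(domain_logits):
    """Pick the domain with the highest selector probability."""
    probs = softmax(domain_logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return DOMAINS[best], probs[best]


def predict_metric(relative_depth, domain_logits):
    """Send the relative prediction through the selected specialist head."""
    domain, _confidence = route(domain_logits)
    return domain, HEADS[domain](relative_depth)


domain, depth_m = predict_metric([0.1, 0.5], domain_logits=[2.0, -1.0])
```

In a real model the selector logits would come from image features rather than being passed in by hand; they are an explicit argument here only to keep the sketch self-contained.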

Generalization and Performance

Because it is trained on many datasets, ZoeDepth generalizes well to unseen indoor and outdoor environments.
It demonstrates strong zero‑shot depth estimation: on datasets it was never trained on, it outperforms prior methods without any additional training.
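The post does not say which metrics back these claims. As a hedged sketch, two metrics commonly reported for monocular depth estimation are absolute relative error (lower is better) and the δ1 threshold accuracy (higher is better). The helper names and sample values below are hypothetical.

```python
def abs_rel(pred, gt):
    """Mean of |pred - gt| / gt over all valid pixels."""
    return sum(abs(p - g) / g for p, g in zip(pred, gt)) / len(gt)


def delta1(pred, gt, threshold=1.25):
    """Fraction of pixels where max(pred/gt, gt/pred) is below the threshold."""
    hits = sum(1 for p, g in zip(pred, gt) if max(p / g, g / p) < threshold)
    return hits / len(gt)


# Hypothetical predicted and ground-truth depths in metres.
pred = [1.0, 2.2, 3.0, 10.0]
gt = [1.0, 2.0, 5.0, 10.0]

error = abs_rel(pred, gt)
accuracy = delta1(pred, gt)
```

In a zero-shot evaluation, these numbers would be computed on a dataset the model never saw during training.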

Applications

Photo‑editing effects that rely on depth cues.
Space measurement tools for interior design or architecture.
Robotics and navigation systems that need reliable depth perception.
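As an illustration of the measurement use case, the sketch below converts two pixels with metric depth into 3D points using a pinhole camera model and measures the distance between them. The camera intrinsics (`fx`, `fy`, `cx`, `cy`) and pixel values are assumed for the example; a real tool would take them from camera calibration.

```python
import math


def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) at depth_m into camera space."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def distance_m(point_a, point_b):
    """Euclidean distance between two 3D points, in metres."""
    return math.dist(point_a, point_b)


# Assumed intrinsics for a 640x480 image (illustrative values only).
intrinsics = dict(fx=500.0, fy=500.0, cx=320.0, cy=240.0)

left = backproject(320, 240, 2.0, **intrinsics)
right = backproject(570, 240, 2.0, **intrinsics)
width_m = distance_m(left, right)
```

The result is only as accurate as the depth map and the intrinsics, which is why metric scale matters for this kind of application.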

Availability

The code and pretrained models are publicly released for community use.
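A minimal usage sketch follows, assuming the models are loaded through PyTorch Hub from the `isl-org/ZoeDepth` repository with the `ZoeD_N` entry point and the `infer_pil` method described in that repository's README. None of these names appear in this post, so check them against the current README before relying on them. The first call downloads the code and weights and therefore needs network access.

```python
def estimate_depth(image_path, variant="ZoeD_N"):
    """Return a depth map for one image (sketch; names assumed from the README)."""
    # Imported lazily so the function can be defined without the dependencies.
    import torch
    from PIL import Image

    # Assumption: repository and entry-point names follow the project README.
    model = torch.hub.load("isl-org/ZoeDepth", variant, pretrained=True)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    # Assumption: infer_pil accepts a PIL image and returns the depth map.
    return model.infer_pil(image)
```

For repeated use, loading the model once and reusing it across images would avoid the per-call setup cost.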
