ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth

Published: January 3, 2026 at 09:50 AM EST
Source: Dev.to


Overview

ZoeDepth estimates depth from a single RGB image. It combines two learning strategies: relative‑depth training, which captures scene structure (which surfaces are nearer or farther) and transfers well across datasets, and metric‑depth training, which recovers absolute distances in real‑world units.

Method

  • Dual learning: The model is first pre‑trained on large, diverse datasets with a relative‑depth objective, then fine‑tuned with metric‑depth supervision so that its predictions carry real‑world scale.
  • Dynamic specialist selection: An internal selector routes each input image to the most suitable lightweight, domain‑specific prediction head (a “tiny specialist”), so a single model can serve varied scenes such as indoor and outdoor.
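
The routing idea above can be illustrated with a toy sketch. This is not ZoeDepth's implementation: the `Specialist` class, the depth ranges, the `scene_scores` input, and the linear mapping from relative to metric depth are all assumptions made for illustration. The actual model learns both the selector and the heads. The sketch only shows how one shared relative‑depth map can yield different metric outputs depending on which specialist is selected.

```python
from dataclasses import dataclass
from typing import Dict, List

Grid = List[List[float]]


@dataclass(frozen=True)
class Specialist:
    """Hypothetical lightweight head: maps relative depth in [0, 1] to metres."""
    name: str
    min_depth_m: float
    max_depth_m: float

    def to_metric(self, relative: Grid) -> Grid:
        # Assumed linear mapping; a learned head would be far more expressive.
        span = self.max_depth_m - self.min_depth_m
        return [[self.min_depth_m + r * span for r in row] for row in relative]


# Assumed depth ranges, chosen only for illustration.
SPECIALISTS: Dict[str, Specialist] = {
    "indoor": Specialist("indoor", 0.5, 10.0),
    "outdoor": Specialist("outdoor", 1.0, 80.0),
}


def select_specialist(scene_scores: Dict[str, float]) -> Specialist:
    """Stand-in for the learned selector: pick the highest-scoring domain."""
    best = max(scene_scores, key=lambda name: scene_scores[name])
    return SPECIALISTS[best]


def predict_metric_depth(relative: Grid, scene_scores: Dict[str, float]) -> Grid:
    """Route the shared relative-depth map through the selected specialist."""
    return select_specialist(scene_scores).to_metric(relative)


# The same relative-depth map, routed to two different specialists.
relative = [[0.0, 0.5], [1.0, 0.25]]
indoor = predict_metric_depth(relative, {"indoor": 0.9, "outdoor": 0.1})
outdoor = predict_metric_depth(relative, {"indoor": 0.2, "outdoor": 0.8})
print(indoor)
print(outdoor)
```

The same relative value of 0.5 lands at very different distances under the two heads, which is why choosing the right specialist per image matters.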

Generalization and Performance

  • Because it is trained on many datasets, ZoeDepth generalizes well to unseen indoor and outdoor environments.
  • It demonstrates strong zero‑shot depth estimation: on datasets it was not trained on, it is reported to outperform prior methods without any additional fine‑tuning.

Applications

  • Photo‑editing effects that rely on depth cues.
  • Space measurement tools for interior design or architecture.
  • Robotics and navigation systems that need reliable depth perception.
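
As a rough sketch of the measurement use case, metric depth can be turned into real‑world distances by back‑projecting pixels through a pinhole camera model. The camera intrinsics (`fx`, `fy`, `cx`, `cy`), the pixel coordinates, and the depth values below are made‑up assumptions, not values from ZoeDepth. In practice the depths would come from the predicted depth map and the intrinsics from camera calibration.

```python
import math
from typing import Tuple

Point3D = Tuple[float, float, float]


def back_project(u: float, v: float, depth_m: float,
                 fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Convert pixel (u, v) with metric depth into camera-space metres (pinhole model)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def distance_m(a: Point3D, b: Point3D) -> float:
    """Euclidean distance between two camera-space points, in metres."""
    return math.dist(a, b)


# Assumed intrinsics for a 640x480 image (illustrative only).
fx = fy = 500.0
cx, cy = 320.0, 240.0

# Two pixels on the same image row, both predicted to be 2 m from the camera.
left = back_project(220.0, 240.0, 2.0, fx, fy, cx, cy)
right = back_project(420.0, 240.0, 2.0, fx, fy, cx, cy)
width = distance_m(left, right)
print(f"Estimated width: {width:.2f} m")
```

This kind of measurement only works with metric depth. A relative‑depth map alone would give the right shape but an unknown scale.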

Availability

The code and pretrained models are publicly released for community use.
