UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler

UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler,
Luigi Piccinelli, Christos Sakaridis, Yung-Hsu Yang, Mattia Segu, Siyuan Li, Wim Abbeloos, Luc Van Gool,
under submission,
Paper at arXiv 2502.20110

UniDepth: Universal Monocular Metric Depth Estimation

UniDepth: Universal Monocular Metric Depth Estimation,
Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis, Mattia Segu, Siyuan Li, Luc Van Gool, Fisher Yu,
CVPR 2024,
Paper at arXiv 2403.18913

News and ToDo

HuggingFace/Gradio demo.
28.02.2025: Release UniDepthV2.
15.10.2024: Release training code.
02.04.2024: Release UniDepth as python package.
01.04.2024: Inference code and V1 models are released.
26.02.2024: UniDepth is accepted at CVPR 2024! (Highlight ⭐)

Zero-Shot Visualization

YouTube (The Office - Parkour)

NuScenes (stitched cameras)

Installation

The following are not hard requirements, but other configurations have not been tested and may behave differently:

Linux
Python 3.10+
CUDA 11.8

Install the environment needed to run UniDepth with:

export VENV_DIR=<YOUR-VENVS-DIR>
export NAME=Unidepth

python -m venv $VENV_DIR/$NAME
source $VENV_DIR/$NAME/bin/activate

# Install UniDepth and dependencies
pip install -e . --extra-index-url https://download.pytorch.org/whl/cu118

# Install Pillow-SIMD (Optional)
pip uninstall pillow
CC="cc -mavx2" pip install -U --force-reinstall pillow-simd

If you use conda, you should change the following:

python -m venv $VENV_DIR/$NAME -> conda create -n $NAME python=3.11
source $VENV_DIR/$NAME/bin/activate -> conda activate $NAME

Note: Make sure that your compilation CUDA version and runtime CUDA version match.
You can check the supported CUDA version for precompiled packages on the PyTorch website.

Note: xFormers may raise the runtime error "Triton Error [CUDA]: device kernel image is invalid".
This happens when the system-wide CUDA version does not match the CUDA version shipped with torch.
It may considerably slow down inference.
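
To see which CUDA version your installed torch build targets, a quick check like the one below may help. It is only a sketch: the helper name cuda_report is hypothetical and not part of UniDepth, and it reports what PyTorch itself exposes, not the system-wide toolkit version.

```python
import importlib.util


def cuda_report():
    """Collect the CUDA-related information that the installed PyTorch reports."""
    if importlib.util.find_spec("torch") is None:
        # torch is not installed in this environment
        return {"torch": None, "built_for_cuda": None, "cuda_available": False}
    import torch

    return {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Compare built_for_cuda with the toolkit version reported by your system (for example via nvcc --version) to spot a mismatch.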

Run UniDepth on the given assets to test your installation (you can use this script as a guideline for further usage):

python ./scripts/demo.py

If everything runs correctly, demo.py should print: ARel: 7.45%.
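
ARel is the absolute relative error between predicted and ground-truth depth. The sketch below shows the standard definition on plain Python lists; it assumes that pixels with non-positive ground truth are invalid and skipped, and it is not taken from demo.py, whose exact masking may differ.

```python
def absolute_relative_error(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid (positive) ground truth."""
    errors = [abs(p - g) / g for p, g in zip(pred, gt) if g > 0]
    if not errors:
        raise ValueError("no valid ground-truth pixels")
    return sum(errors) / len(errors)


# Toy values in metres; 0.0 marks a pixel without ground truth.
pred = [2.0, 4.4, 9.0, 1.0]
gt = [2.0, 4.0, 10.0, 0.0]
print(f"ARel: {100 * absolute_relative_error(pred, gt):.2f}%")
```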

If you encounter a segmentation fault after running the demo, you may need to uninstall torch via pip (pip uninstall torch) and install the torch version listed in the requirements with conda.

Get Started

After installing the dependencies, you can load the pre-trained models easily from Hugging Face as follows:

from unidepth.models import UniDepthV1

model = UniDepthV1.from_pretrained("lpiccinelli/unidepth-v1-vitl14") # or "lpiccinelli/unidepth-v1-cnvnxtl" for the ConvNext backbone

You can then predict metric depth and camera intrinsics directly from a single RGB image:

import numpy as np
import torch
from PIL import Image

# Move to CUDA, if any
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

# Load the RGB image and the normalization will be taken care of by the model
rgb = torch.from_numpy(np.array(Image.open(image_path))).permute(2, 0, 1) # C, H, W

predictions = model.infer(rgb)

# Metric Depth Estimation
depth = predictions["depth"]

# Point Cloud in Camera Coordinate
xyz = predictions["points"]

# Intrinsics Prediction
intrinsics = predictions["intrinsics"]
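
The three outputs are related: under a pinhole camera model, a depth map and a 3 x 3 intrinsics matrix determine a point cloud in camera coordinates. The sketch below illustrates that relation on nested Python lists. It assumes a pinhole model with pixel coordinates at integer indices; the function name unproject is hypothetical, and the source does not state that predictions["points"] is computed exactly this way.

```python
def unproject(depth, K):
    """Back-project an H x W depth map (nested lists, metres) to camera-frame points.

    K is a 3 x 3 pinhole intrinsics matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
    Returns an H x W grid of (x, y, z) tuples.
    """
    fx, fy = K[0][0], K[1][1]
    cx, cy = K[0][2], K[1][2]
    points = []
    for v, row in enumerate(depth):  # v: row index (image y)
        out_row = []
        for u, z in enumerate(row):  # u: column index (image x)
            out_row.append(((u - cx) / fx * z, (v - cy) / fy * z, z))
        points.append(out_row)
    return points


# Toy 2 x 3 depth map and intrinsics (illustrative values only).
K = [[100.0, 0.0, 1.0], [0.0, 100.0, 0.5], [0.0, 0.0, 1.0]]
depth = [[2.0, 2.0, 2.0], [4.0, 4.0, 4.0]]
points = unproject(depth, K)
```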

You can use ground truth intrinsics as input to the model as well:

intrinsics_path = "assets/demo/intrinsics.npy"

# Load the intrinsics if available
intrinsics = torch.from_numpy(np.load(intrinsics_path)) # 3 x 3

predictions = model.infer(rgb, intrinsics)

To use the forward method for your custom training, you should:

1. Take care of the dataloading:
   a) ImageNet normalization
   b) Long-edge-based resizing (and padding) to the input shape given by image_shape in the configs
   c) BxCxHxW format
   d) If intrinsics are given, adapt them to your resizing
2. Format the input data structure as:

data = {"image": rgb, "K": intrinsics}
predictions = model(data, {})
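
The dataloading steps above can be sketched as follows. This is a minimal illustration and not the repository's own dataloader: the function names, the 512 x 512 target shape, and the centred padding are assumptions, so check the configs and training code for the conventions actually used. The ImageNet mean and standard deviation are the standard values for images scaled to [0, 1].

```python
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def imagenet_normalize(rgb_pixel):
    """Normalize one (r, g, b) pixel with values in [0, 1]."""
    return tuple((c - m) / s for c, m, s in zip(rgb_pixel, IMAGENET_MEAN, IMAGENET_STD))


def long_edge_resize(h, w, target_h, target_w):
    """Scale (h, w) to fit inside the target shape, keeping aspect ratio, then pad."""
    scale = min(target_h / h, target_w / w)
    new_h, new_w = round(h * scale), round(w * scale)
    pad_top = (target_h - new_h) // 2    # assumption: padding is centred
    pad_left = (target_w - new_w) // 2
    return scale, (new_h, new_w), (pad_top, pad_left)


def adapt_intrinsics(K, scale, pad_top, pad_left):
    """Apply the same resize and padding to a 3 x 3 pinhole intrinsics matrix."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    return [
        [fx * scale, 0.0, cx * scale + pad_left],
        [0.0, fy * scale, cy * scale + pad_top],
        [0.0, 0.0, 1.0],
    ]


# A 480 x 640 image resized to a hypothetical 512 x 512 input shape.
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
scale, new_shape, (pad_top, pad_left) = long_edge_resize(480, 640, 512, 512)
K_resized = adapt_intrinsics(K, scale, pad_top, pad_left)
```

The resized and padded image would then be stacked into a BxCxHxW tensor and passed as data["image"], with the adapted matrix as data["K"].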

Model Zoo

The available models are the following:

Model	Backbone	Name
UniDepthV1	ConvNext-L	unidepth-v1-cnvnxtl
UniDepthV1	ViT-L	unidepth-v1-vitl14
UniDepthV2	ViT-S	unidepth-v2-vits14
UniDepthV2	ViT-B	unidepth-v2-vitb14
UniDepthV2	ViT-L	unidepth-v2-vitl14

Visit Hugging Face or follow the links above to access the model repositories and weights. You can load UniDepth as follows, with the name variable set to one of the names in the table above:

from unidepth.models import UniDepthV1, UniDepthV2

model_v1 = UniDepthV1.from_pretrained(f"lpiccinelli/{name}")
model_v2 = UniDepthV2.from_pretrained(f"lpiccinelli/{name}")
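
Since the class must match the model version, a small lookup can pick the right one from the name. The helper below is a hypothetical convenience, not part of the unidepth package; the ViT-B name is assumed to follow the same naming pattern as the other V2 models.

```python
# Names from the Model Zoo table, grouped by the class that loads them.
MODEL_ZOO = {
    "UniDepthV1": ("unidepth-v1-cnvnxtl", "unidepth-v1-vitl14"),
    # "unidepth-v2-vitb14" is an assumption based on the naming pattern.
    "UniDepthV2": ("unidepth-v2-vits14", "unidepth-v2-vitb14", "unidepth-v2-vitl14"),
}


def resolve(name, owner="lpiccinelli"):
    """Return (class name, Hugging Face repo id) for a Model Zoo name."""
    for class_name, names in MODEL_ZOO.items():
        if name in names:
            return class_name, f"{owner}/{name}"
    raise KeyError(f"unknown model name: {name}")


class_name, repo_id = resolve("unidepth-v2-vitl14")
print(class_name, repo_id)
```

With unidepth installed, the model could then be loaded via getattr(unidepth.models, class_name).from_pretrained(repo_id).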

In addition, we provide loading from TorchHub as:

import torch

version = "v2"
backbone = "vitl14"

model = torch.hub.load("lpiccinelli-eth/UniDepth", "UniDepth", version=version, backbone=backbone, pretrained=True, trust_repo=True, force_reload=True)

See the UniDepth function in hubconf.py for how to instantiate the model from a local file: provide a local path in line 34.

UniDepthV2

Visit the UniDepthV2 README for a more detailed changelog. In summary, the main differences are:

Improved performance and edge sharpness. (EdgeGuidedLocalSSI)
Input shape and ratio flexibility. (self.resolution_level)
Confidence output.
Faster inference.
ONNX support.

Training

Please visit the training README for more information.

Results

Metric Depth Estimation

The performance reported is for the UniDepthV1 model and the metric is d1 (higher is better) on zero-shot evaluation. The common split between SUN-RGBD and NYUv2 is removed from the SUN-RGBD validation set for evaluation.

Model	NYUv2	SUN-RGBD	ETH3D	Diode (In)	IBims-1	KITTI	Nuscenes	DDAD
iDisc	93.8	83.7	35.6	23.8	48.9	97.5	39.4	28.4
ZoeDepth	95.2	86.7	35.0	36.9	58.0	96.5	28.3	27.2
Metric3D	92.6	15.4	45.6	39.2	79.7	97.5	72.3	-
Metric3Dv2	98.9	81.2	90.0	-	68.4	98.5	84.1	-
DepthPro	-	83.1	39.7	-	82.3	-	56.6	29.9
UniDepthV1	98.4	94.3	18.5	77.1	15.7	98.6	84.6	85.8
UniDepthV2	98.8	96.4	85.2	-	94.5	98.9	87.0	88.2
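
For reference, d1 is commonly defined as the percentage of valid pixels whose ratio max(pred/gt, gt/pred) is below 1.25. The sketch below implements that standard definition on plain Python lists; it is not the repository's evaluation code, and the validity rule (positive ground truth) is an assumption.

```python
def delta1(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) below the threshold."""
    hits, valid = 0, 0
    for p, g in zip(pred, gt):
        if g <= 0:  # no ground truth for this pixel
            continue
        valid += 1
        # A non-positive prediction can never satisfy the ratio test.
        if p > 0 and max(p / g, g / p) < threshold:
            hits += 1
    if valid == 0:
        raise ValueError("no valid ground-truth pixels")
    return hits / valid


# Toy values in metres: three of the four pixels fall within the 1.25 ratio.
pred = [1.0, 2.0, 5.0, 9.0]
gt = [1.1, 2.0, 3.0, 10.0]
print(f"d1: {100 * delta1(pred, gt):.1f}")
```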

Contributions

If you find a bug in the code, please report it to Luigi Piccinelli ([email protected])

Citation

If you find our work useful in your research, please consider citing our publications:

@inproceedings{piccinelli2024unidepth,
    title     = {{U}ni{D}epth: Universal Monocular Metric Depth Estimation},
    author    = {Piccinelli, Luigi and Yang, Yung-Hsu and Sakaridis, Christos and Segu, Mattia and Li, Siyuan and Van Gool, Luc and Yu, Fisher},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2024}
}

@misc{piccinelli2025unidepthv2,
      title={{U}ni{D}epth{V2}: Universal Monocular Metric Depth Estimation Made Simpler}, 
      author={Luigi Piccinelli and Christos Sakaridis and Yung-Hsu Yang and Mattia Segu and Siyuan Li and Wim Abbeloos and Luc Van Gool},
      year={2025},
      eprint={2502.20110},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.20110}, 
}

License

This software is released under the Creative Commons BY-NC 4.0 license. You can view a license summary here.

Acknowledgement

We would like to express our gratitude to @niels for helping integrate UniDepth into Hugging Face.

This work is funded by Toyota Motor Europe via the research project TRACE-Zurich (Toyota Research on Automated Cars Europe).
