Depth estimation is the task of measuring the distance of each pixel relative to the camera. This repo provides a TensorRT implementation of the Depth-Anything depth estimation model in both C++ and Python, enabling efficient real-time inference.
- 2024-06-20: Added support for TensorRT 10.
- 2024-06-17: Depth Anything V2 has been integrated.
- 2024-01-23: The Depth Anything TensorRT version has been created.
The inference time includes the pre-processing and post-processing stages:
Device | Model | Model Input (WxH) | Image Resolution (WxH) | Inference Time (ms) |
---|---|---|---|---|
RTX4090 | Depth-Anything-S | 518x518 | 1280x720 | 3 |
RTX4090 | Depth-Anything-B | 518x518 | 1280x720 | 6 |
RTX4090 | Depth-Anything-L | 518x518 | 1280x720 | 12 |
Note
Inference was conducted using FP16 precision, with a warm-up period of 10 frames. The reported time corresponds to the last inference.
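The measurement procedure in the note (10 warm-up frames, then report the last inference) can be sketched as follows. This is a minimal illustration, not the benchmark code used for the table: `time_last_inference` and `fake_infer` are hypothetical names, and `fake_infer` stands in for the full pipeline (pre-processing, engine execution, post-processing). On a GPU, the timed callable would also have to block until the result is ready, because CUDA work is launched asynchronously.

```python
import time

def time_last_inference(infer, frame, warmup=10):
    """Run `warmup` untimed calls, then time one more call.

    Returns (result, elapsed_ms) for the final, timed call.
    """
    for _ in range(warmup):
        infer(frame)  # warm-up: results are discarded
    start = time.perf_counter()
    result = infer(frame)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms

# Hypothetical stand-in for the real inference pipeline.
calls = []
def fake_infer(frame):
    calls.append(frame)
    return frame * 2

result, ms = time_last_inference(fake_infer, 21)
print(f"last inference: {ms:.3f} ms after {len(calls) - 1} warm-up calls")
```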
- Usage 1: Create an engine from an ONNX model and save it:
depth-anything-tensorrt.exe <onnx model> <input image or video>
- Usage 2: Deserialize an existing engine. Once the engine has been built, pass the engine file directly on subsequent runs:
depth-anything-tensorrt.exe <engine> <input image or video>
Example:
# infer image
depth-anything-tensorrt.exe depth_anything_vitb14.engine test.jpg
# infer folder(images)
depth-anything-tensorrt.exe depth_anything_vitb14.engine data
# infer video
depth-anything-tensorrt.exe depth_anything_vitb14.engine test.mp4 # the video path
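Usage 1 and Usage 2 differ only in which model file is passed as the first argument, so a wrapper script could pick the right one automatically. The sketch below assumes the engine is saved next to the ONNX file with the same stem and an `.engine` extension; the example file names suggest this, but it is an assumption. `pick_model_file` is a hypothetical helper, not part of this repo.

```python
import tempfile
from pathlib import Path

def pick_model_file(onnx_path):
    """Return the cached engine if it exists, otherwise the ONNX file."""
    onnx_path = Path(onnx_path)
    engine_path = onnx_path.with_suffix(".engine")  # assumed naming convention
    return engine_path if engine_path.is_file() else onnx_path

# Demonstration in a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    onnx_file = Path(tmp) / "depth_anything_vitb14.onnx"
    onnx_file.touch()
    first_run = pick_model_file(onnx_file)     # no engine yet: ONNX file
    onnx_file.with_suffix(".engine").touch()   # pretend the first run saved an engine
    second_run = pick_model_file(onnx_file)    # engine found: engine file

print(first_run.name, second_run.name)
```

The returned path would then be passed as the first argument to `depth-anything-tensorrt.exe`.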
cd depth-anything-tensorrt/python
# infer image
python trt_infer.py --engine <path to trt engine> --img <single-img> --outdir <outdir> [--grayscale]
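The `--grayscale` flag implies that the raw depth values are mapped to an 8-bit image before being written to `<outdir>`. A common way to do this is min-max normalization, sketched below in plain Python. This illustrates the general technique under that assumption and is not taken from `trt_infer.py`; `depth_to_uint8` is a hypothetical helper.

```python
def depth_to_uint8(depth):
    """Min-max scale a 2D list of depth values to integers in 0..255."""
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # Constant depth map: avoid division by zero, return all zeros.
        return [[0 for _ in row] for row in depth]
    return [[round((v - lo) / span * 255) for v in row] for row in depth]

# Tiny 2x2 "depth map" used only for illustration.
depth = [[0.0, 1.0], [3.0, 4.0]]
gray = depth_to_uint8(depth)
print(gray)
```

In practice this step would operate on an array, and the result would be resized back to the original image resolution before saving.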
Refer to our docs/INSTALL.md for C++ environment installation.
cd <tensorrt installation path>/python
pip install cuda-python
pip install tensorrt-8.6.0-cp310-none-win_amd64.whl
pip install opencv-python
Perform the following steps to create an ONNX model:
- Download the pretrained model and install Depth-Anything:
git clone https://github.com/LiheYoung/Depth-Anything
cd Depth-Anything
pip install -r requirements.txt
- Copy all files in depth-anything_v1 to the <depth_anything_installpath>/depth_anything folder. Note that I've only removed a squeeze operation at the end of the model's forward function in dpt.py to avoid conflicts with TensorRT.
- Export the model to ONNX format using export_v1.py. You will get an ONNX file named depth_anything_vit{}14.onnx, such as depth_anything_vitb14.onnx. Note that I used the CPU version of torch to export the ONNX model, as a GPU is not needed for exporting.
conda create -n depth-anything python=3.8
conda activate depth-anything
pip install torch torchvision
pip install opencv-python
pip install onnx
python export_v1.py --encoder vitb --load_from depth_anything_vitb14.pth --image_shape 3 518 518
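The output file name follows the pattern `depth_anything_vit{}14.onnx`, with the placeholder filled by the encoder size. A minimal sketch of that mapping is shown below. The encoder list `vits`, `vitb`, `vitl` is an assumption based on the S, B, and L models in the benchmark table, and `onnx_name` is a hypothetical helper rather than a function in the export script.

```python
# Assumed encoder names, matching the S/B/L models in the benchmark table.
ENCODERS = ("vits", "vitb", "vitl")

def onnx_name(encoder):
    """Build the expected ONNX file name for a given --encoder value."""
    if encoder not in ENCODERS:
        raise ValueError(f"unknown encoder {encoder!r}, expected one of {ENCODERS}")
    return f"depth_anything_{encoder}14.onnx"

print(onnx_name("vitb"))
```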
- Clone Depth-Anything-V2:
git clone https://github.com/DepthAnything/Depth-Anything-V2.git
cd Depth-Anything-V2
pip install -r requirements.txt
- Download the pretrained models from the readme and put them in the checkpoints folder.
- Copy all files in depth_anything_v2 to the <depth_anything_installpath>/depth_anything_v2 folder.
- Run the following to export the model:
conda create -n depth-anything python=3.8
conda activate depth-anything
pip install torch torchvision
pip install opencv-python
pip install onnx
python export_v2.py --encoder vitb --input-size 518
Tip
The width and height of the model input should both be divisible by 14, the patch size of the ViT encoder.
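The divisibility rule can be checked, or a requested size snapped to the nearest valid one, before exporting. The sketch below is a small illustration with hypothetical helper names (`snap_to_patch_multiple`, `patch_grid`); it is not part of the export scripts.

```python
PATCH_SIZE = 14  # patch size stated in the tip above

def snap_to_patch_multiple(size, patch=PATCH_SIZE):
    """Round `size` to the nearest positive multiple of `patch`."""
    return max(patch, (size + patch // 2) // patch * patch)

def patch_grid(width, height, patch=PATCH_SIZE):
    """Return the number of patches along each axis, or raise if invalid."""
    if width % patch or height % patch:
        raise ValueError(f"{width}x{height} is not divisible by {patch}")
    return width // patch, height // patch

print(snap_to_patch_multiple(512))  # nearest valid size to 512
print(patch_grid(518, 518))         # patch grid for the default input
```

For the default 518x518 input this gives a 37x37 grid of patches, and a requested size of 512 would be snapped to 518.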
This project is based on the following projects:
- Depth-Anything - Unleashing the Power of Large-Scale Unlabeled Data.
- TensorRT - TensorRT samples and api documentation.