GitHub - songlin/d3roma: A diffusion model-based stereo depth estimation framework that can predict state-of-the-art depth and restore noisy depth maps for transparent and specular surfaces

D³RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation
CoRL 2024, Munich, Germany.

This is the official repository of D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation.

For more information, please visit our project page.

Songlin Wei, Haoran Geng, Jiayi Chen, Congyue Deng, Wenbo Cui, Chengyang Zhao, Xiaomeng Fang, Leonidas Guibas, He Wang

💡 Updates (Dec 14, 2024)

We have released a new model variant (Cond. on RGB+Raw); please check out the updated inference.py.
Training protocols and dataset: coming soon.

Our method robustly predicts transparent (bottles) and specular (basin and cups) object depths in tabletop environments and beyond.

INSTALLATION

conda create --name d3roma python=3.8
conda activate d3roma

# install dependencies with pip
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install huggingface_hub==0.24.5
pip install diffusers opencv-python scikit-image matplotlib transformers datasets accelerate tensorboard imageio open3d kornia
pip install hydra-core --upgrade
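After the pip commands finish, it can help to confirm that every dependency actually landed in the d3roma environment. The following is a minimal sketch, not part of the repository: the helper name check_environment is hypothetical, and the package list is simply copied from the commands above. It uses only the standard library, so it runs even if some installs failed.

```python
from importlib import metadata

# Distribution names copied from the pip commands above.
REQUIRED = [
    "torch", "torchvision", "torchaudio", "huggingface_hub",
    "diffusers", "opencv-python", "scikit-image", "matplotlib",
    "transformers", "datasets", "accelerate", "tensorboard",
    "imageio", "open3d", "kornia", "hydra-core",
]


def check_environment(packages=REQUIRED):
    """Return {package name: installed version, or None if not installed}."""
    report = {}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, version in check_environment().items():
        print(f"{name:16s} {version or 'MISSING'}")
```

Any line printed as MISSING points to a package that should be reinstalled before running inference.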

DOWNLOAD PRE-TRAINED WEIGHTS

For model variant Cond. Left+Right+Raw: Google Drive, Baidu Cloud (百度云)
For model variant Cond. RGB+Raw: Google Drive, Baidu Cloud (百度云)

# Download the pretrained weights from Google Drive
# and extract them under the project folder

RUN INFERENCE

You can run the following script to test our model:

python inference.py

This will generate three files under the folder _outputs:

_outputs/pred.png: the pseudo-colored depth map

_outputs/pred.ply: the point cloud obtained by back-projecting the predicted depth

_outputs/raw.ply: the point cloud obtained by back-projecting the camera raw depth
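For readers unfamiliar with back-projection, the sketch below shows the standard pinhole-camera version of the idea: each pixel (u, v) with depth z maps to a 3D point in the camera frame. This is an illustration under stated assumptions, not the code in inference.py. The function name backproject_depth and the intrinsics fx, fy, cx, cy are hypothetical placeholders; the real values depend on the camera used.

```python
import numpy as np


def backproject_depth(depth, fx, fy, cx, cy):
    """Back-project an HxW depth map into an Nx3 point cloud (camera frame).

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels. Pixels with non-positive or non-finite
    depth are treated as invalid and skipped.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = np.isfinite(depth) & (depth > 0)
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)


if __name__ == "__main__":
    # Toy 2x2 depth map with one invalid (zero) pixel.
    toy_depth = [[1.0, 0.0], [2.0, 2.0]]
    print(backproject_depth(toy_depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0))
```

The resulting array is in the same units as the input depth. One way to save it as a .ply file is to wrap it with Open3D's o3d.utility.Vector3dVector, assign it to an o3d.geometry.PointCloud, and call o3d.io.write_point_cloud.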

Training Protocols & Dataset (Coming Soon)

Contact

If you have any questions, please contact us:

Songlin Wei: [email protected], Haoran Geng: [email protected], He Wang: [email protected]

Citation

@inproceedings{
  wei2024droma,
  title={D3RoMa: Disparity Diffusion-based Depth Sensing for Material-Agnostic Robotic Manipulation},
  author={Songlin Wei and Haoran Geng and Jiayi Chen and Congyue Deng and Cui Wenbo and Chengyang Zhao and Xiaomeng Fang and Leonidas Guibas and He Wang},
  booktitle={8th Annual Conference on Robot Learning},
  year={2024},
  url={https://openreview.net/forum?id=7E3JAys1xO}
}

License

This work and the dataset are licensed under CC BY-NC 4.0.

