forked from jonashein/pvnet_baseline
Showing 61 changed files with 1,931 additions and 1,640 deletions.
@@ -1,6 +1,9 @@
__pycache__/
.cache/
.conda/
.idea/
.ipynb_checkpoints/
runs/
*.py[cod]
*.so
*.orig
@@ -1,280 +1,138 @@
# PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation
# Towards Markerless Surgical Tool and Hand Pose Estimation: PVNet Baseline

![introduction](./assets/introduction.png)

> [PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation](https://arxiv.org/pdf/1812.11788.pdf)
> Sida Peng, Yuan Liu, Qixing Huang, Xiaowei Zhou, Hujun Bao
> CVPR 2019 oral
> [Project Page](https://zju3dv.github.io/pvnet)

Any questions or discussions are welcome!
## Introduction

Thanks [Haotong Lin](https://github.com/haotongl) for providing the clean version of PVNet and reproducing the results.

- [Project page](http://medicalaugmentedreality.org/handobject.html)
<!-- - [Paper](http://arxiv.org/abs/2004.13449) -->

The structure of this project is described in [project_structure.md](project_structure.md).
## Installation

One way is to set up the environment with docker. See [this](https://github.com/zju3dv/clean-pvnet/tree/master/docker).

Thanks **Floris Gaisser** for providing the docker implementation.

Another way is to use the following commands.
1. Set up the python environment:
```
conda create -n pvnet python=3.7
conda activate pvnet
# install torch 1.1 built from cuda 9.0
pip install torch==1.1.0 -f https://download.pytorch.org/whl/cu90/stable
pip install Cython==0.28.2
sudo apt-get install libglfw3-dev libglfw3
pip install -r requirements.txt
```
2. Compile cuda extensions under `lib/csrc`:
```
ROOT=/path/to/clean-pvnet
cd $ROOT/lib/csrc
export CUDA_HOME="/usr/local/cuda-9.0"
cd dcn_v2
python setup.py build_ext --inplace
cd ../ransac_voting
python setup.py build_ext --inplace
cd ../nn
python setup.py build_ext --inplace
cd ../fps
python setup.py build_ext --inplace
# If you want to use the uncertainty-driven PnP
cd ../uncertainty_pnp
sudo apt-get install libgoogle-glog-dev
sudo apt-get install libsuitesparse-dev
sudo apt-get install libatlas-base-dev
python setup.py build_ext --inplace
```
3. Set up datasets:
```
ROOT=/path/to/clean-pvnet
cd $ROOT/data
ln -s /path/to/linemod linemod
ln -s /path/to/linemod_orig linemod_orig
ln -s /path/to/occlusion_linemod occlusion_linemod
# the following is used for tless
ln -s /path/to/tless tless
ln -s /path/to/cache cache
ln -s /path/to/SUN2012pascalformat sun
```
Download datasets which are formatted for this project:
1. [linemod](https://zjueducn-my.sharepoint.com/:u:/g/personal/pengsida_zju_edu_cn/EXK2K0B-QrNPi8MYLDFHdB8BQm9cWTxRGV9dQgauczkVYQ?e=beftUz)
2. [linemod_orig](https://zjueducn-my.sharepoint.com/:u:/g/personal/pengsida_zju_edu_cn/EaoGIPguY3FAgrFKKhi32fcB_nrMcNRm8jVCZQd7G_-Wbg?e=ig4aHk): The dataset includes the depth for each image.
3. [occlusion linemod](https://zjueducn-my.sharepoint.com/:u:/g/personal/pengsida_zju_edu_cn/ESXrP0zskd5IvvuvG3TXD-4BMgbDrHZ_bevurBrAcKE5Dg?e=r0EgoA)
4. [truncation linemod](https://1drv.ms/u/s!AtZjYZ01QjphfuDICdni1IIM4SE): Check [TRUNCATION_LINEMOD.md](TRUNCATION_LINEMOD.md) for the information about the Truncation LINEMOD dataset.
5. [Tless](https://zjueducn-my.sharepoint.com/:f:/g/personal/pengsida_zju_edu_cn/EsKEY3aHNElEjaKbhCJVyQgBUGTlprdcyF5sgLjEv8J8TQ?e=NbJpkM): `cat tlessa* | tar xvf - -C .`.
6. [Tless cache data](https://zjueducn-my.sharepoint.com/:u:/g/personal/pengsida_zju_edu_cn/EWf-M5HRcH1JnBNN9yE1a84BYNAU7x1DoU_-W3Onl5Xxog?e=HZSrMu): It is used for training and testing on Tless.
7. [SUN2012pascalformat](http://groups.csail.mit.edu/vision/SUN/releases/SUN2012pascalformat.tar.gz)
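A quick way to verify that the dataset links created in step 3 resolve is sketched below; skip the datasets you do not use.
```sh
# Check that each expected dataset symlink in data/ points to something that exists
cd $ROOT/data
for d in linemod linemod_orig occlusion_linemod tless cache sun; do
  [ -e "$d" ] && echo "ok: $d" || echo "missing: $d"
done
```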
## Table of Contents

- [Setup](#setup)
- [Demo](#demo)
- [Training](#training)
- [Evaluation](#evaluation)
- [Visualization](#visualization)
- [Citations](#citations)

## Testing

### Testing on Linemod

We provide the pretrained models of objects on Linemod, which can be found [here](https://1drv.ms/f/s!AtZjYZ01QjphgQBQDQghxjbkik5f).
Take the testing on `cat` as an example.
1. Prepare the data related to `cat`:
```
python run.py --type linemod cls_type cat
```
2. Download the pretrained model of `cat` and put it to `$ROOT/data/model/pvnet/cat/199.pth`.
3. Test:
```
python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat
python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest model cat cls_type cat
```
4. Test with icp:
```
python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat test.icp True
python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest model cat cls_type cat test.icp True
```
5. Test with the uncertainty-driven PnP:
```
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:./lib/csrc/uncertainty_pnp/lib
python run.py --type evaluate --cfg_file configs/linemod.yaml model cat cls_type cat test.un_pnp True
python run.py --type evaluate --cfg_file configs/linemod.yaml test.dataset LinemodOccTest model cat cls_type cat test.un_pnp True
```
### Testing on Tless

We provide the pretrained models of objects on Tless, which can be found [here](https://zjueducn-my.sharepoint.com/:u:/g/personal/pengsida_zju_edu_cn/EbcvcBH-eFJDm7lFqillf_oB8Afr2d6vtELNn0tUUk439g?e=bNZaDc).
1. Download the pretrained models and put them to `$ROOT/data/model/pvnet/`.
2. Test:
```
python run.py --type evaluate --cfg_file configs/tless/tless_01.yaml
# or
python run.py --type evaluate --cfg_file configs/tless/tless_01.yaml test.vsd True
```
## Visualization

### Visualization on Linemod

Take the `cat` as an example.
1. Prepare the data related to `cat`:
```
python run.py --type linemod cls_type cat
```
2. Download the pretrained model of `cat` and put it to `$ROOT/data/model/pvnet/cat/199.pth`.
3. Visualize:
```
python run.py --type visualize --cfg_file configs/linemod.yaml model cat cls_type cat
```
If set up correctly, the output will look like

![cat](./assets/cat.png)

## Setup

### Download and install prerequisites
```sh
sudo apt-get install libglfw3-dev libglfw3
```
### Download and Install Code
```sh
git clone https://github.com/jonashein/pvnet_baseline.git
cd pvnet_baseline
conda env create --file=environment.yml
conda activate pvnet
```
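As a quick sanity check that the environment was created correctly (this assumes `environment.yml` provides PyTorch, which PVNet requires), you can run:
```sh
# Print the installed torch version and whether CUDA is usable from this environment
python -c "import torch; print(torch.__version__, 'CUDA available:', torch.cuda.is_available())"
```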
4. Visualize with a detector:

Download the pretrained models [here](https://zjueducn-my.sharepoint.com/:u:/g/personal/haotongl_zju_edu_cn/EZxeOruBmGZLr8vldbB381ABo4cpI1VsE4HhqjizMw1Opw?e=OUGtql) and put them to `$ROOT/data/model/pvnet/pvnet_cat/59.pth` and `$ROOT/data/model/ct/ct_cat/9.pth`.
```
python run.py --type detector_pvnet --cfg_file configs/ct_linemod.yaml
```

Compile the cuda extension for RANSAC voting under `lib/csrc/ransac_voting`:
```sh
cd lib/csrc/ransac_voting/
python setup.py build_ext --inplace
cd ../../../
```
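If the build succeeded, `build_ext --inplace` leaves a compiled extension next to the sources; a quick check:
```sh
# A compiled shared object (*.so) should now exist in the ransac_voting directory
ls lib/csrc/ransac_voting/*.so
```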
### Download Synthetic Dataset

Download the synthetic dataset from the [project page](http://medicalaugmentedreality.org/handobject.html), or use the commands below:
```sh
cd data/
wget http://medicalaugmentedreality.org/datasets/syn_colibri_v1.zip
unzip -x syn_colibri_v1.zip
cd ../
```
Convert the dataset into the format expected by PVNet:
```sh
python3 pvnet_custom_dataset.py -m assets/drill_segmentation_textured_final.ply -d data/syn_colibri_v1/train.txt -o data/ -n syn_colibri_v1_train
python3 pvnet_custom_dataset.py -m assets/drill_segmentation_textured_final.ply -d data/syn_colibri_v1/val.txt -o data/ -n syn_colibri_v1_val
python3 pvnet_custom_dataset.py -m assets/drill_segmentation_textured_final.ply -d data/syn_colibri_v1/test.txt -o data/ -n syn_colibri_v1_test
```
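The three calls differ only in the split name, so they can also be written as a loop with the same arguments (the same pattern applies to the real dataset below):
```sh
# Convert the train/val/test splits of the synthetic dataset in one go
for split in train val test; do
  python3 pvnet_custom_dataset.py \
    -m assets/drill_segmentation_textured_final.ply \
    -d data/syn_colibri_v1/${split}.txt \
    -o data/ \
    -n syn_colibri_v1_${split}
done
```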
### Download Real Dataset

Download the real dataset from the [project page](http://medicalaugmentedreality.org/handobject.html), or use the commands below:
```sh
cd data/
wget http://medicalaugmentedreality.org/datasets/real_colibri_v1.zip
unzip -x real_colibri_v1.zip
cd ../
```
Convert the dataset into the format expected by PVNet:
```sh
python3 pvnet_custom_dataset.py -m assets/drill_segmentation_textured_final.ply -d data/real_colibri_v1/train.txt -o data/ -n real_colibri_v1_train
python3 pvnet_custom_dataset.py -m assets/drill_segmentation_textured_final.ply -d data/real_colibri_v1/val.txt -o data/ -n real_colibri_v1_val
python3 pvnet_custom_dataset.py -m assets/drill_segmentation_textured_final.ply -d data/real_colibri_v1/test.txt -o data/ -n real_colibri_v1_test
```

### Visualization on Tless

Visualize:
```
python run.py --type visualize --cfg_file configs/tless/tless_01.yaml
# or
python run.py --type visualize --cfg_file configs/tless/tless_01.yaml test.det_gt True
```
## Training

### Training on Linemod

1. Prepare the data related to `cat`:
```
python run.py --type linemod cls_type cat
```
2. Train:
```
python train_net.py --cfg_file configs/linemod.yaml model mycat cls_type cat
```
The training parameters can be found in [project_structure.md](project_structure.md).
### Training on Tless

Train:
```
python train_net.py --cfg_file configs/tless/tless_01.yaml
```

Pretrain a model on the synthetic dataset:
```sh
python train_net.py --cfg_file configs/syn_colibri_v1_train.yaml
```

Refine a model on the real dataset:
```sh
python train_net.py --cfg_file configs/real_colibri_v1_train.yaml
```

The training checkpoints and monitoring data will be stored at `data/model/` and `data/record/`, respectively.
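Individual config values can be overridden from the command line using the same `key value` style that appears elsewhere in this README (e.g. `train.batch_size 4` in the custom-object section); for example, to pretrain with a smaller batch size:
```sh
# Override a single training option without editing the YAML config
python train_net.py --cfg_file configs/syn_colibri_v1_train.yaml train.batch_size 4
```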
### Tensorboard

Losses and validation metrics can be monitored on tensorboard:
```sh
tensorboard --logdir data/record/pvnet
```

If set up correctly, the output will look like

![tensorboard](./assets/tensorboard.png)

## Evaluation
Evaluate a pretrained model on the synthetic dataset:
```sh
python train_net.py --test --cfg_file configs/syn_colibri_v1_test.yaml
```

Evaluate a refined model on the real dataset:
```sh
python train_net.py --test --cfg_file configs/real_colibri_v1_test.yaml
```

After evaluating a model, the test set metrics can be computed by running:
```sh
python3 compute_metrics.py -m "data/record/metrics.pkl"
```
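The metrics file is a Python pickle written during evaluation; its exact structure is defined by the evaluation code and is not documented here, but a minimal sketch for inspecting it looks like this:
```sh
python3 - <<'PY'
import pickle

# Load the pickled metrics produced by the evaluation run (path as used above)
with open("data/record/metrics.pkl", "rb") as f:
    metrics = pickle.load(f)

# Print a rough summary; adapt once you know the actual structure
print(type(metrics))
if isinstance(metrics, dict):
    for key in metrics:
        print(key)
PY
```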
## Training on the custom object

An example dataset can be downloaded [here](https://zjueducn-my.sharepoint.com/:u:/g/personal/pengsida_zju_edu_cn/Ec6Hd9j7z4lCiwDhqIwDcScBGPw2rsbn6FJh1C2FwbPJTw?e=xcKGAw).
1. Create a dataset using https://github.com/F2Wang/ObjectDatasetTools
2. Organize the dataset as the following structure (a small sanity-check sketch follows this list):
```
├── /path/to/dataset
│   ├── model.ply
│   ├── camera.txt
│   ├── diameter.txt  // the object diameter, in meters
│   ├── rgb/
│   │   ├── 0.jpg
│   │   ├── ...
│   │   ├── 1234.jpg
│   │   ├── ...
│   ├── mask/
│   │   ├── 0.png
│   │   ├── ...
│   │   ├── 1234.png
│   │   ├── ...
│   ├── pose/
│   │   ├── pose0.npy
│   │   ├── ...
│   │   ├── pose1234.npy
│   │   ├── ...
```
3. Create a soft link pointing to the dataset:
```
ln -s /path/to/custom_dataset data/custom
```
4. Process the dataset:
```
python run.py --type custom
```
5. Train:
```
python train_net.py --cfg_file configs/custom.yaml train.batch_size 4
```
6. Watch the training curve:
```
tensorboard --logdir data/record/pvnet
```
7. Visualize:
```
python run.py --type visualize --cfg_file configs/custom.yaml
```
8. Test:
```
python run.py --type evaluate --cfg_file configs/custom.yaml
```
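Before processing a custom dataset, it can help to sanity-check the layout from step 2. The sketch below assumes `camera.txt` holds the 3x3 intrinsics, `diameter.txt` a single value in meters, and each `pose*.npy` an object pose matrix; it only prints shapes so that obvious layout problems show up early:
```sh
python3 - <<'PY'
import numpy as np

root = "/path/to/dataset"  # adjust to your dataset location

K = np.loadtxt(f"{root}/camera.txt")                    # assumed: 3x3 camera intrinsics
diameter = float(open(f"{root}/diameter.txt").read())   # assumed: object diameter in meters
pose = np.load(f"{root}/pose/pose0.npy")                # assumed: object pose matrix for frame 0

print("camera:", K.shape, "diameter [m]:", diameter, "pose:", pose.shape)
PY
```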
## Visualization

To visualize the keypoint estimates and render 3D views of the tool pose estimates, run:
```sh
python run.py --type visualize --test --cfg_file configs/real_colibri_v1_test.yaml --vis_out visualizations/
```
## Citations

If you find this code useful for your research, please consider citing:
* the publication that this code was adapted for:
```
@inproceedings{hein21_towards,
  title = {Towards Markerless Surgical Tool and Hand Pose Estimation},
  author = {Hein, Jonas and Seibold, Matthias and Bogo, Federica and Farshad, Mazda and Pollefeys, Marc and Fürnstahl, Philipp and Navab, Nassir},
  booktitle = {IPCAI},
  year = {2021}
}
```
* the publication it builds upon and that this code was originally developed for:
```
@inproceedings{peng2019pvnet,
  title={PVNet: Pixel-wise Voting Network for 6DoF Pose Estimation},
  author={Peng, Sida and Liu, Yuan and Huang, Qixing and Zhou, Xiaowei and Bao, Hujun},
  booktitle={CVPR},
  year={2019}
}
```
## Acknowledgement

This work is affiliated with ZJU-SenseTime Joint Lab of 3D Vision, and its intellectual property belongs to SenseTime Group Ltd.

```
Copyright (c) ZJU-SenseTime Joint Lab of 3D Vision. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```