- Dataset
- Dense Depth for Autonomous Driving (DDAD)
- KITTI Eigen Split

wget -i splits/kitti_archives_to_download.txt -P kitti_data/
cd kitti_data/
unzip "*.zip"
cd ..
find kitti_data/ -name '*.png' | parallel 'convert -quality 92 -sampling-factor 2x2,1x1,1x1 {.}.png {.}.jpg && rm {}'

- The above conversion command creates JPEG images with the default chroma subsampling 2x2,1x1,1x1.
- Problem Setting
While specialist hardware can give per-pixel depth, a more attractive approach is to require only a single RGB camera: train a deep network to map from an input image to a depth map.
- Methods
- Geometry Models
The simplest representation of a camera is an image plane at a given position and orientation in space.
The pinhole camera geometry models the camera with two sub-parameterizations: intrinsic and extrinsic parameters. Intrinsic parameters model the optical component (without distortion), and extrinsic parameters model the camera's position and orientation in space. A 3D point is projected into the image with the following formula (homogeneous coordinates):
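In the standard pinhole model (stated here for completeness, not taken from this repo's code), a 3D point (X, Y, Z) maps to pixel (u, v) via the intrinsics K and extrinsics [R | t]:

```latex
s \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
= K \,[R \mid t]\,
\begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix},
\qquad
K = \begin{pmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{pmatrix}
```

Here s is the projective scale (the point's depth along the optical axis), f_x, f_y are the focal lengths in pixels, and (c_x, c_y) is the principal point.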
- Cross-View Reconstruction
Frames the learning problem as one of novel view synthesis, by training a network to predict the appearance of a target image from the viewpoint of another image using depth (disparity); the problem is formulated as the minimization of a photometric reprojection error at training time.
Here, pe is a photometric reconstruction error, proj() returns the 2D coordinates of the projected depths Dₜ in the source view, and <> is the sampling operator. For simplicity of notation we assume the precomputed intrinsics K of all views are identical, though they can be different. α is set to 0.85.
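Written out (following the Monodepth2 formulation that this description matches), the reprojection loss is:

```latex
L_p = \min_{t'} \; pe\big(I_t,\; I_{t' \to t}\big),
\qquad
I_{t' \to t} = I_{t'}\big\langle \operatorname{proj}(D_t,\, T_{t \to t'},\, K) \big\rangle
```

```latex
pe(I_a, I_b) = \frac{\alpha}{2}\big(1 - \operatorname{SSIM}(I_a, I_b)\big) + (1 - \alpha)\,\lVert I_a - I_b \rVert_1
```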
We consider the scene structure and camera motion at the same time, where camera pose estimation has a positive impact on monocular depth estimation. The two sub-networks (depth and pose) are trained jointly, and the entire model is constrained by an image reconstruction loss similar to stereo matching methods.
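The view-synthesis step described above — back-project the target depth, transform by the relative pose T, and sample the source image — can be sketched in PyTorch. This is an illustrative sketch, not the repo's actual implementation; the function names are made up:

```python
import torch
import torch.nn.functional as F

def backproject(depth, K_inv):
    """Lift each pixel to a 3D camera-frame point: X = D * K^-1 * [u, v, 1]^T."""
    b, _, h, w = depth.shape
    ys, xs = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                            torch.arange(w, dtype=torch.float32), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).view(3, -1)  # (3, H*W)
    pix = pix.unsqueeze(0).expand(b, -1, -1)                             # (B, 3, H*W)
    return depth.view(b, 1, -1) * (K_inv @ pix)                          # (B, 3, H*W)

def project(points, K, T):
    """Transform target-frame points into the source frame and project with K."""
    b, _, n = points.shape
    homog = torch.cat([points, torch.ones(b, 1, n)], dim=1)  # (B, 4, N)
    cam = (T @ homog)[:, :3]                                 # (B, 3, N)
    pix = K @ cam
    return pix[:, :2] / pix[:, 2:3].clamp(min=1e-7)          # (B, 2, N)

def warp_to_target(src, depth, K, T):
    """Synthesize the target view by sampling the source image at projected coords."""
    b, _, h, w = src.shape
    points = backproject(depth, torch.inverse(K))
    pix = project(points, K, T).view(b, 2, h, w)
    # Normalize pixel coordinates to [-1, 1], as grid_sample expects.
    grid = torch.stack([pix[:, 0] / (w - 1) * 2 - 1,
                        pix[:, 1] / (h - 1) * 2 - 1], dim=-1)  # (B, H, W, 2)
    return F.grid_sample(src, grid, padding_mode="border", align_corners=True)
```

With the identity pose the projection lands back on the original pixel grid, so the warp reproduces the source image exactly — a handy sanity check when debugging this pipeline.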
- Folder
dataset/
2011_09_26/
...
...
model_dataloader/
model_layer/
model_loss/
model_save/
model_test.py
model_train.py
model_parser.py
model_utility.py
- Packages
apt-get update -y
apt-get install moreutils
or
apt-get install -y moreutils
- Training
python model_train.py --pose_type separate --datatype kitti_eigen_zhou
python model_train.py --pose_type separate --datatype kitti_benchmark
- Test
python model_test.py
- Evaluation
kitti_eigen_zhou
abs_rel  sq_rel  rmse   rmse_log  a1     a2     a3
0.125    0.977   4.992  0.202     0.861  0.955  0.980
kitti_eigen_benchmark
abs_rel  sq_rel  rmse   rmse_log  a1     a2     a3
0.104    0.809   4.502  0.182     0.900  0.963  0.981
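The columns above are the usual Eigen-style KITTI depth metrics. A NumPy sketch of how they are typically computed (the function name is illustrative, not from this repo):

```python
import numpy as np

def depth_metrics(gt, pred):
    """Compute the standard KITTI depth-evaluation metrics on flattened,
    masked depth arrays (both strictly positive, in meters)."""
    thresh = np.maximum(gt / pred, pred / gt)
    a1 = (thresh < 1.25).mean()          # fraction within 1.25x of ground truth
    a2 = (thresh < 1.25 ** 2).mean()
    a3 = (thresh < 1.25 ** 3).mean()
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel = np.mean(((gt - pred) ** 2) / gt)
    rmse = np.sqrt(np.mean((gt - pred) ** 2))
    rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3
```

Note that predictions are usually median-scaled to the ground truth before these metrics are computed, since monocular training only recovers depth up to scale.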
-
What is a feature map? That's the yellow block in the image.
-
It's a collection of N two-dimensional "maps" that each represent a particular "feature" that the model has spotted within the image.
-
This is why convolutional layers are known as feature extractors.
-
How do we get from input (whether image or feature map) to a feature map?
-
Through kernels, or filters.
-
You configure some number N of kernels per convolutional layer.
-
Each kernel "slides" (convolves) over your input data.
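The points above can be sketched in a few lines of PyTorch (an illustrative example, not code from this repo):

```python
import torch
import torch.nn as nn

# N = 8 kernels in one convolutional layer: it turns a 3-channel
# RGB image into 8 two-dimensional feature maps.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1)

image = torch.rand(1, 3, 64, 64)   # a batch of one 64x64 RGB image
features = conv(image)             # each kernel "slides" over the input

# One feature map per kernel; padding=1 keeps the spatial size at 64x64.
print(features.shape)              # torch.Size([1, 8, 64, 64])
```

Each of the 8 output channels is produced by a different learned kernel, which is exactly why the layer acts as a bank of N feature extractors.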