Unofficial implementation of NeRF-W (NeRF in the wild) using pytorch (pytorch-lightning). I tried to reproduce the results on the lego dataset (Section D). Training on real images (the main content of the paper) is not possible, since the authors did not release the data.
- OS: Ubuntu 18.04
- NVIDIA GPU with CUDA>=10.2 (tested with 1 RTX2080Ti)
- Clone this repo by `git clone https://github.com/kwea123/nerf_pl`
- Python>=3.6 (installation via anaconda is recommended; use `conda create -n nerf_pl python=3.6` to create a conda environment and activate it by `conda activate nerf_pl`)
- Python libraries
  - Install core requirements by `pip install -r requirements.txt`
Download `nerf_synthetic.zip` from here.
All random seeds are fixed to reproduce the same perturbations every time.
- Color perturbations: uses the same parameters as in the paper.
- Occlusions: the square has size 200x200 (should be the same as in the paper); its position is randomly sampled inside the middle 400x400 area, and the 10 colors are random (a sketch follows this list).
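As a rough illustration of the occlusion perturbation described above (assuming it is applied to the original 800x800 image; the function name, the stripe layout of the 10 colors, and the exact sampling ranges are illustrative, not necessarily identical to the dataset code in this repo):

```python
import numpy as np
from PIL import Image, ImageDraw

def add_occluder(img, seed=0):
    """Paste a 200x200 occluder made of 10 random-colored 20-pixel stripes.

    Illustrative sketch: the top-left corner is sampled so the square stays
    inside the central 400x400 region of an 800x800 image, and the seed is
    fixed per image so the perturbation is reproducible.
    """
    rng = np.random.RandomState(seed)     # fixed seed -> same occluder every run
    img = img.copy()
    draw = ImageDraw.Draw(img)
    left = rng.randint(200, 400)          # keeps the 200x200 square in the middle region
    top = rng.randint(200, 400)
    for i in range(10):                   # 10 vertical stripes, each 20 px wide
        color = tuple(int(c) for c in rng.randint(0, 256, 3))
        draw.rectangle((left + 20 * i, top, left + 20 * (i + 1), top + 200), fill=color)
    return img

# example: img = add_occluder(Image.open('r_0.png').convert('RGB'), seed=0)
```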
Base:
```bash
python train.py \
--dataset_name blender \
--root_dir $BLENDER_DIR \
--N_importance 64 --img_wh 400 400 --noise_std 0 \
--num_epochs 20 --batch_size 1024 \
--optimizer adam --lr 5e-4 --lr_scheduler cosine \
--exp_name exp
```
Add `--encode_a` for appearance embedding and `--encode_t` for transient embedding.
Add `--data_perturb color occ` to perturb the dataset.
Example: to train NeRF-U on "occluders" (Table 3 of the paper), run
```bash
python train.py \
--dataset_name blender \
--root_dir $BLENDER_DIR \
--N_importance 64 --img_wh 400 400 --noise_std 0 \
--num_epochs 20 --batch_size 1024 \
--optimizer adam --lr 5e-4 --lr_scheduler cosine \
--exp_name exp \
--data_perturb occ \
--encode_t --beta_min 0.1
```
See opt.py for all configurations.
You can monitor the training process by `tensorboard --logdir logs/` and go to `localhost:6006` in your browser.
Download the pretrained models and training logs from the release.
Example: `test_nerfu_occ.ipynb` shows how NeRF-U successfully decomposes the scene into static and transient components when the scene has random occluders.
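For intuition, here is a rough sketch of the compositing that makes this decomposition possible: the transmittance along a ray uses the combined static+transient density, and dropping the transient density yields the static-only image. The function name and tensor layout are illustrative, not the repo's actual rendering API.

```python
import torch

def composite_ray(static_sigma, static_rgb, transient_sigma, transient_rgb, deltas):
    """Sketch of NeRF-W/NeRF-U compositing along one ray with N samples.

    static_sigma, transient_sigma, deltas: (N,); static_rgb, transient_rgb: (N, 3).
    """
    # transmittance uses the combined (static + transient) density
    alpha = 1 - torch.exp(-(static_sigma + transient_sigma) * deltas)
    T = torch.cumprod(torch.cat([torch.ones(1), 1 - alpha + 1e-10])[:-1], 0)
    static_w = T * (1 - torch.exp(-static_sigma * deltas))
    transient_w = T * (1 - torch.exp(-transient_sigma * deltas))
    full = (static_w[:, None] * static_rgb + transient_w[:, None] * transient_rgb).sum(0)
    # static-only image: drop the transient density entirely and re-composite
    alpha_s = 1 - torch.exp(-static_sigma * deltas)
    T_s = torch.cumprod(torch.cat([torch.ones(1), 1 - alpha_s + 1e-10])[:-1], 0)
    static_only = ((T_s * alpha_s)[:, None] * static_rgb).sum(0)
    return full, static_only
```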
Use `eval.py` to create the whole sequence of moving views. E.g.
```bash
python eval.py \
--root_dir $BLENDER \
--dataset_name blender --scene_name lego \
--img_wh 400 400 --N_importance 64 --ckpt_path $CKPT_PATH
```
It will create a folder `results/{dataset_name}/{scene_name}`, run inference on all test data, and finally create a gif out of the renderings.
Example of the lego scene using the pretrained NeRF-U model under the occluder condition (PSNR=28.60 vs. 23.47 in the paper):
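The PSNR quoted here is the standard image PSNR; a minimal sketch for a single rendered view (the reported number is presumably the average over the test set computed by `eval.py`):

```python
import numpy as np

def psnr(pred, gt):
    """PSNR in dB between two images with float values in [0, 1]."""
    mse = np.mean((pred - gt) ** 2)
    return -10 * np.log10(mse)

# pred = rendered view, gt = ground-truth test image, both float arrays in [0, 1]
```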
- Network structure (`nerf.py`):
  - My base MLP uses 8 layers of 256 units as in the original NeRF, while NeRF-W uses 512 units each.
  - My static head uses 1 layer as in the original NeRF, while NeRF-W uses 4 layers.
  - I use softplus activation for sigma (reason explained here), while NeRF-W uses relu (see the sketch after this list).
- Training hyperparameters:
  - I find that a larger `beta_min` achieves better results, so my default `beta_min` is `0.1` instead of `0.03` in the paper.
  - I add 3 to `beta_loss` (equation 13) to make it positive empirically.
Evalutaion
- The evaluation metric is computed on the test set, while NeRF evaluates on val and test combined.
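Below is a rough sketch of how the softplus activation for sigma, `beta_min`, and the +3 offset on the beta term of equation 13 could fit together; the tensor shapes, the `lambda_u` weight, and the exact wiring of `beta_min` are assumptions, not necessarily identical to this repo's code.

```python
import torch
import torch.nn.functional as F

def nerfw_loss(rgb_pred, rgb_gt, raw_beta, transient_sigma,
               beta_min=0.1, lambda_u=0.01):
    """Sketch of the NeRF-W photometric loss (equation 13 of the paper).

    rgb_pred, rgb_gt: (N_rays, 3); raw_beta: (N_rays,) raw network output;
    transient_sigma: (N_rays, N_samples); lambda_u=0.01 is an assumed value.
    """
    beta = F.softplus(raw_beta) + beta_min          # beta_min keeps beta away from 0
    color_term = ((rgb_pred - rgb_gt) ** 2 / (2 * beta[:, None] ** 2)).mean()
    beta_term = 3 + torch.log(beta).mean()          # log(beta) + 3 stays positive here
    sigma_term = lambda_u * transient_sigma.mean()  # discourages spurious transient density
    return color_term + beta_term + sigma_term

# The density uses softplus instead of relu, e.g. sigma = F.softplus(raw_sigma)
```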