init
apchenstu committed Jul 9, 2024
0 parents commit 825856d
Showing 164 changed files with 25,678 additions and 0 deletions.
15 changes: 15 additions & 0 deletions .gitignore
@@ -0,0 +1,15 @@
__pycache__/
.idea/
.ipynb_checkpoints/
*.py[cod]
*.so
*.orig
*.o
*.json
*.pth
*.npy
*.ipynb
*.png
logs/*
outputs/*
ckpts/*
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "third_party/diff-surfel-rasterization"]
path = third_party/diff-surfel-rasterization
url = [email protected]:hbb1/diff-surfel-rasterization.git
94 changes: 94 additions & 0 deletions README.md
@@ -0,0 +1,94 @@
# LaRa: Efficient Large-Baseline Radiance Fields

[Project page](https://apchenstu.github.io/LaRa/) | [Paper](https://arxiv.org/abs/2407.04699) | [Data](https://huggingface.co/apchen/LaRa/tree/main/dataset) | [Checkpoint](https://huggingface.co/apchen/LaRa/tree/main/ckpts)<br>

![Teaser image](assets/demo.gif)

## ⭐ New Features
- 2024/04/05: Important update: our method now supports half-precision (bf16) training, achieving more than **2x faster** convergence and about a **1.5 dB** PSNR gain with fewer training epochs!

| Model | PSNR ↑ | SSIM ↑ | Abs err (Geo) ↓ | Epochs | Time (days) | ckpt |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| Paper | 27.65 | 0.951 | 0.0654 | 50 | 3.5 | ------ |
| bf16 | 29.15 | 0.956 | 0.0574 | 30 | 1.5 | [Download](https://huggingface.co/apchen/LaRa/tree/main/ckpts/) |

Please download the pre-trained checkpoint from the provided link and place it in the `ckpts` folder.
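The speed-up above comes from bf16 mixed-precision training. As a rough, hedged illustration only (this is not the repo's `train_lightning.py`, whose Trainer setup may differ), bf16 training is typically enabled in PyTorch Lightning like this:

```python
# Hedged sketch: enabling bf16 mixed precision with PyTorch Lightning 2.x.
# The model/datamodule come from the repo's own code and are not defined here.
import pytorch_lightning as pl

trainer = pl.Trainer(
    precision="bf16-mixed",   # bf16 activations/gradients, fp32 master weights
    devices=[0],              # pick your GPU ids, cf. gpu_id in configs/base.yaml
    max_epochs=30,            # matches the bf16 row in the table above
)
# trainer.fit(model, datamodule=data)   # model / data are provided by the repo
```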

# Installation

```
git clone https://github.com/autonomousvision/LaRa.git --recursive
conda env create --file environment.yml
conda activate lara
```


# Dataset
We used the processed [gobjaverse dataset](https://aigc3d.github.io/gobjaverse/) for training. A download script `tools/download_dataset.py` is provided to automatically download the datasets.

```
python tools/download_dataset.py all
```
Note: The GObjaverse dataset requires approximately 1.4 TB of storage. You can also download a subset of the dataset; see the script for the available options. After the download completes, manually delete the `_temp` folder.

If you would like to process the data yourself, we provide preprocessing scripts for the GObjaverse and Co3D datasets; please check `tools/prepare_dataset_*`.
You can also download our preprocessed data and place it in the `dataset` folder (an expected layout is sketched after this list):
* [gobjaverse](#gobjaverse)
* [Google Scanned Objects](#GSO)
* [Co3D](#Co3D)
* Instant3D - Please contact the authors of Instant3D if you wish to obtain the data for comparison.
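
Based on the `data_root` defaults in `configs/base.yaml` and `configs/infer.yaml` (shown further down in this commit), the `dataset` folder is expected to look roughly like the following; the exact files depend on which subsets you download:

```
dataset/
  gobjaverse/
    gobjaverse.h5            # train/test data_root in configs/base.yaml
  google_scanned_objects/    # GSO evaluation set (configs/infer.yaml)
  Co3D/
    co3d_teddybear.hdf5
    co3d_hydrant.hdf5
  instant3D/                 # data obtained from the Instant3D authors
```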

# Training
```
python train_lightning.py
```
**note:** You can configure the GPU IDs and other parameters in `configs/base.yaml`.
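For example, a minimal edit of `configs/base.yaml` to train on a single GPU could look like this (only `gpu_id` is changed; the other values are the defaults from this commit):

```yaml
gpu_id: [0]        # default in this commit: [4,5,6,7]

train:
  batch_size: 3    # reduce if you run out of GPU memory
  n_epoch: 30
```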

# Evaluation
Our method supports the reconstruction of radiance fields from **multi-view**, **text**, and **single-view** inputs. We provide a pre-trained checkpoint at [ckpt](https://huggingface.co/apchen/LaRa/tree/main/ckpts).

## multi-view to 3D
To reproduce the table results, you can simply use:
```
python eval_all.py
```
**note:** Please double-check that the paths inside the script are correct for your specific case.

## text to 3D
```
python evaluation.py configs/infer.yaml \
    infer.ckpt_path=ckpts/lara.ckpt \
    infer.save_folder=outputs/prompts/ \
    infer.dataset.generator_type=instant3d \
    infer.dataset.prompts=\["a car made out of sushi","a beautiful rainbow fish"\]
```
**note:** You can specify multiple prompts.

## single view to 3D
```
python evaluation.py configs/infer.yaml \
    infer.ckpt_path=ckpts/lara.ckpt \
    infer.save_folder=outputs/single-view/ \
    infer.dataset.generator_type="zero123plus-v1.1" \
    infer.dataset.image_pathes=\["assets/examples/19_dalle3_stump1.png"\]
```
**note:** Supported generator types include `zero123plus-v1.1` and `zero123plus-v1.2` (see the comments in `configs/infer.yaml` for the full list).
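
The `infer.*` arguments above are OmegaConf-style dot-list overrides applied on top of `configs/infer.yaml`. As a hedged illustration only (the repo's `evaluation.py` may load its config differently), such overrides are typically merged like this:

```python
# Hedged sketch, not the repo's actual config loader.
import sys
from omegaconf import OmegaConf

base = OmegaConf.load(sys.argv[1])             # e.g. configs/infer.yaml
overrides = OmegaConf.from_cli(sys.argv[2:])   # e.g. infer.ckpt_path=ckpts/lara.ckpt
cfg = OmegaConf.merge(base, overrides)

print(cfg.infer.ckpt_path, cfg.infer.save_folder)
```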



## Acknowledgements
Our renderer is built upon [2DGS](https://github.com/hbb1/2d-gaussian-splatting). The data preprocessing code for the Co3D dataset is partially borrowed from [Splatter-Image](https://github.com/szymanowiczs/splatter-image/blob/main/data_preprocessing/preprocess_co3d.py). The script for generating multi-view images from text and single-view images is sourced from [GRM](https://github.com/justimyhxu/grm). We thank all the authors for their great repos.

## Citation
If you find our code or paper helpful, please consider citing:
```bibtex
@inproceedings{LaRa,
    author    = {Anpei Chen and Haofei Xu and Stefano Esposito and Siyu Tang and Andreas Geiger},
    title     = {LaRa: Efficient Large-Baseline Radiance Fields},
    booktitle = {European Conference on Computer Vision (ECCV)},
    year      = {2024}
}
```


Binary file added assets/demo.gif
70 changes: 70 additions & 0 deletions configs/base.yaml
@@ -0,0 +1,70 @@
gpu_id: [4,5,6,7]

exp_name: LaRa/release-test
n_views: 4

model:

  encoder_backbone: 'vit_base_patch16_224.dino' # ['vit_small_patch16_224.dino','vit_base_patch16_224.dino']

  n_groups: [16] # n_groups for local attention
  n_offset_groups: 32 # offset radius of 1/n_offset_groups of the scene size

  K: 2 # primitives per-voxel
  sh_degree: 1 # view dependent color

  num_layers: 12
  num_heads: 16

  view_embed_dim: 32
  embedding_dim: 256

  vol_feat_reso: 16
  vol_embedding_reso: 32

  vol_embedding_out_dim: 80

  ckpt_path: null # specify a ckpt path if you want to continue training

train_dataset:
  dataset_name: gobjeverse
  data_root: dataset/gobjaverse/gobjaverse.h5

  split: train
  img_size: [512,512] # image resolution
  n_group: ${n_views} # number of input views per group
  n_scenes: 3000000
  load_normal: True



test_dataset:
  dataset_name: gobjeverse
  data_root: dataset/gobjaverse/gobjaverse.h5

  split: test
  img_size: [512,512]
  n_group: ${n_views}
  n_scenes: 3000000
  load_normal: True

train:
  batch_size: 3
  lr: 4e-4
  beta1: 0.9
  beta2: 0.95
  weight_decay: 0.05
  # betas: [0.9, 0.95]
  warmup_iters: 1000
  n_epoch: 30
  limit_train_batches: 0.2
  limit_val_batches: 0.02
  check_val_every_n_epoch: 1
  start_fine: 5000
  use_rand_views: False
test:
  batch_size: 3

logger:
  name: tensorboard
  dir: logs/${exp_name}
65 changes: 65 additions & 0 deletions configs/infer.yaml
@@ -0,0 +1,65 @@
n_views: 4

infer:
  dataset:
    # dataset_name: gobjeverse
    # data_root: dataset/gobjaverse_280k/gobjaverse_280k.hdf5
    # data_root: dataset/Co3D/co3d_teddybear.hdf5
    # data_root: dataset/Co3D/co3d_hydrant.hdf5

    # dataset_name: GSO
    # data_root: dataset/google_scanned_objects

    # dataset_name: instant3d
    # data_root: dataset/instant3D

    # text to 3D
    dataset_name: mvgen
    generator_type: instant3d
    prompts: ["a car made out of sushi"]
    image_pathes: []

    ## single view to 3D
    # dataset_name: mvgen
    # generator_type: zero123plus-v1.1 # zero123plus-v1.1,zero123plus-v1.2,sv3d
    # prompts: []
    # image_pathes: ['examples/19_dalle3_stump1.png']

    # # unposed inputs
    # dataset_name: unposed
    # image_pathes: examples/unposed/*.png

    split: test
    img_size: [512,512]
    n_group: 4
    n_scenes: 30000
    num_workers: 0
    batch_size: 1

    load_normal: False

  ckpt_path: ckpts/lara.ckpt

  eval_novel_view_only: True
  eval_depth: []
  metric_path: None

  save_folder: outputs/video_vis/mvgen
  video_frames: 120
  mesh_video_frames: 0

  save_mesh: True
  aabb: [-0.5,-0.5,-0.5,0.5,0.5,0.5]

finetuning:
  with_ft: False
  steps: 500

  # lr
  position_lr: 0.000016
  feature_lr: 0.0025
  opacity_lr: 0.05
  scaling_lr: 0.005
  rotation_lr: 0.001


Binary file added configs/render/cathedral.hdr
17 changes: 17 additions & 0 deletions configs/render/cathedral.xml
@@ -0,0 +1,17 @@
<scene version="2.1.0">
    <emitter type="envmap">
        <string name="filename" value="configs/render/cathedral.hdr" />
        <float name="scale" value="0.5"/>
        <transform name="to_world">
            <rotate x="0.5" y="-1.5" z="0.5" angle="20"/>
        </transform>
    </emitter>

    <!-- Point Light -->
    <emitter type="point">
        <point name="position" x="-2" y="-2" z="-3"/>
        <spectrum name="intensity" value="50"/>
    </emitter>
</scene>
41 changes: 41 additions & 0 deletions configs/render/common.xml
@@ -0,0 +1,41 @@
<scene version="2.1.0">
    <default name="spp" value="32"/>
    <default name="resx" value="64"/>
    <default name="resy" value="64"/>
    <default name="disable_edge_gradient" value="false"/>
    <default name="disable_shading_gradient" value="false"/>
    <default name="edge_epsilon" value="0.01"/>
    <default name="legacy_mode" value="false"/>
    <default name="max_depth" value="3"/>
    <default name="refined_intersection" value="false"/>
    <default name="integrator" value="path"/>
    <default name="integrator_file" value="integrator_sdf.xml"/>
    <default name="pixel_format" value="rgb"/>
    <default name="sample_border" value="true"/>
    <default name="use_aovs" value="false"/>
    <default name="pixel_filter" value="gaussian"/>
    <default name="sdf_filename" value=""/>
    <default name="sensors_filename" value="sensors.xml"/>
    <default name="use_mis" value="false"/>
    <default name="hide_emitters" value="false"/>
    <default name="use_antithetic_sampling" value="false"/>
    <default name="detach_weight_sum" value="false"/>
    <default name="decouple_reparam" value="false"/>
    <default name="detach_indirect_si" value="false"/>
    <default name="reparam_primary_only" value="false"/>

    <include filename="$integrator_file"/>
    <include filename="$sensors_filename"/>

    <bsdf type="twosided" id="default-bsdf">
        <bsdf type="diffuse">
            <rgb name="reflectance" value="0.5, 0.5, 0.5"/>
        </bsdf>
    </bsdf>

    <bsdf type="twosided" id="red-bsdf">
        <bsdf type="diffuse">
            <rgb name="reflectance" value="0.8, 0.2, 0.2"/>
        </bsdf>
    </bsdf>
</scene>
7 changes: 7 additions & 0 deletions configs/render/integrator_path.xml
@@ -0,0 +1,7 @@
<scene version="2.1.0">
    <integrator type="path">
        <integer name="max_depth" value="$max_depth"/>
        <boolean name="hide_emitters" value="true"/>
    </integrator>
</scene>

26 changes: 26 additions & 0 deletions configs/render/scene.xml
@@ -0,0 +1,26 @@
<scene version="2.1.0">
    <!-- <path value="../../"/> -->
    <include filename="configs/render/common.xml"/>
    <default name="emitter_scene" value="configs/render/cathedral.xml"/>
    <include filename="$emitter_scene"/>

    <!-- <bsdf type="thindielectric" id="plastic_bsdf">
        <rgb name="specular_transmittance" value="0.7, 0.8, 0.95"/>
    </bsdf> -->

    <bsdf type="roughplastic" id="plastic_bsdf">
        <float name="alpha" value="0.1" />
        <rgb name="diffuse_reflectance" value="0.25, 0.5, 0.8"/>
    </bsdf>

    <default name="main_bsdf_name" value="plastic_bsdf"/>

    <shape type="$mesh_type">
        <ref id="$main_bsdf_name" name="bsdf"/>
        <string name="filename" value="$mesh_path"/>
        <transform name="to_world">
            <translate x="0.0" y="0.0" z="0.0"/>
        </transform>
    </shape>

</scene>
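
These XML files use the Mitsuba scene description format; the `$name` placeholders (e.g. `$mesh_path`, `$mesh_type`) are filled in at load time. A hedged sketch of how a mesh might be rendered with these files using Mitsuba 3 follows; the mesh path and the `sensors.xml` camera file are placeholders not included in this commit, and the repo's actual rendering script may differ:

```python
# Hedged sketch, not the repo's rendering code. Run from the repository root
# so the relative include paths inside scene.xml resolve.
import mitsuba as mi

mi.set_variant("scalar_rgb")   # CPU variant; use "cuda_ad_rgb" if available

scene = mi.load_file(
    "configs/render/scene.xml",
    mesh_type="ply",                                         # fills $mesh_type
    mesh_path="outputs/mesh.ply",                            # fills $mesh_path (placeholder)
    integrator_file="configs/render/integrator_path.xml",    # integrator shipped in this commit
    sensors_filename="sensors.xml",                          # camera definitions (placeholder)
    spp=64, resx=512, resy=512,
)

img = mi.render(scene)
mi.util.write_bitmap("render.png", img)
```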