
DynOMo: Online Point Tracking by Dynamic Online Monocular Gaussian Reconstruction

Jenny Seidenschwarz · Qunjie Zhou · Bardienus Duisterhof · Deva Ramanan · Laura Leal-Taixe



This repository contains the official code of the 3DV 2025 paper "DynOMo: Online Point Tracking by Dynamic Online Monocular Gaussian Reconstruction".

Table of Contents
  1. Installation
  2. Downloads
  3. Usage
  4. Acknowledgement
  5. Citation
  6. Developers

Installation

We provide a conda environment file to create our environment. Please run the following to install all necessary dependencies, including the rasterizer.

# create conda environment and install rasterizer
bash scripts/create_env.sh
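
After the script finishes, you can optionally sanity-check the setup. The following is a minimal sketch and assumes the environment created by scripts/create_env.sh is named dynomo; check the script for the actual name it uses.

# activate the environment (assumption: it is named "dynomo")
conda activate dynomo
# DynOMo optimizes on the GPU, so CUDA should be available
python -c "import torch; print(torch.cuda.is_available())"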

Downloads

We provide a script to download and preprocess the TAP-Vid DAVIS, Panoptic Sport, and iPhone datasets. Additionally, you can pre-compute DepthAnything depth maps and DINO embeddings. However, we also provide the option to predict the depth maps and embeddings during optimization. Please use our script as follows:

# base command
python scripts/prepare_data.py <DATASET> --download --embeddings --embedding_model <EMBEDDING_MODEL> --depth --depth_model <DEPTH_MODEL>

where the flags mean the following:

  • <DATASET>: choose either davis, panoptic_sport, or iphone
  • download: downloads the data
  • embeddings: pre-computes the embeddings
  • embedding_model <EMBEDDING_MODEL>: determines the DINO version, i.e., either dinov2_vits14_reg or dinov2_vits14
  • depth: pre-computes depth predictions
  • depth_model <DEPTH_MODEL>: determines the depth model, i.e., either DepthAnything or DepthAnythingV2-vitl

To preprocess the data the same way as we did, please run the following:

# Download and prepare davis dataset
python scripts/prepare_data.py davis --download --embeddings --embedding_model dinov2_vits14_reg --depth --depth_model DepthAnything

# Download and prepare panoptic sport dataset
python scripts/prepare_data.py panoptic_sport --download --embeddings --embedding_model dinov2_vits14_reg

# Download and prepare iphone dataset
python scripts/prepare_data.py iphone --download --embeddings --embedding_model dinov2_vits14_reg

Please note: for the results in our paper we use depth predictions generated with Dynamic 3D Gaussians for Panoptic Sport and depth predictions from Shape of Motion for the iPhone dataset, so we do not pre-compute depth maps for these two datasets here.

Usage

To run DynOMo, please run the run_dynomo.py script as follows:

# base command
python scripts/run_dynomo.py <CONFIG_FILE> --gpus <GPUS_TO_USE> 

where the flags are defined as follows (a concrete example follows the list):

  • <CONFIG_FILE>: one of the config files for the specific datasets, i.e., either config/davis/dynomo_davis.py, config/iphone/dynomo_iphone.py, or config/panoptic_sports/dynomo_panoptic_sports.py
  • gpus: the GPUs to use for the optimization as a comma-separated list, e.g., 0,1,2,3,4,5,6,7
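
For example, running DynOMo on TAP-Vid DAVIS with two GPUs looks like this:

# example: optimize the DAVIS sequences on GPUs 0 and 1
python scripts/run_dynomo.py config/davis/dynomo_davis.py --gpus 0,1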

Additionally, to predict depth and embeddings online, add the following flags:

# base command with online depth and embedding computation
python scripts/run_dynomo.py <CONFIG_FILE> --gpus <GPUS_TO_USE> --online_depth <DEPTH_MODEL> --online_emb <EMBEDDING_MODEL>

where <DEPTH_MODEL> and <EMBEDDING_MODEL> are defined as follows (see the example after the list):

  • online_depth <DEPTH_MODEL>: determines the depth model, i.e., either DepthAnything or DepthAnythingV2-vitl
  • online_emb <EMBEDDING_MODEL>: determines the DINO version, i.e., either dinov2_vits14_reg or dinov2_vits14
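
For instance, to run on TAP-Vid DAVIS while predicting depth and embeddings on the fly:

# example: online depth and embedding prediction
python scripts/run_dynomo.py config/davis/dynomo_davis.py --gpus 0 --online_depth DepthAnything --online_emb dinov2_vits14_reg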

Finally, to evaluate an already optimized model, add the just_eval flag:

# base command for evaluation only
python scripts/run_dynomo.py <CONFIG_FILE> --gpus <GPUS_TO_USE> --just_eval

This will re-evaluate the trajectories and store visualizations of the tracked trajectories, a grid of tracked points from the foreground mask, and the online rendered training views to experiments/<DATASET>/<RUN_NAME>/<SEQUENCE>/eval.

Additionally, you can set the following flags for evaluation (a combined example follows the list):

  • not_eval_renderings: skips re-rendering the training views
  • not_eval_trajs: skips evaluating the trajectories
  • not_vis_trajs: skips visualizing the tracked trajectories
  • not_vis_grid: skips visualizing a grid of tracked point trajectories
  • vis_bg_and_fg: samples points from both the foreground and background during grid visualization
  • vis_gt: visualizes all ground-truth data, i.e., depth, embeddings, background mask, and rgb
  • vis_rendered: visualizes all rendered data, i.e., depth, embeddings, background mask, and rgb
  • novel_view_mode: renders views from a novel viewpoint; choose from zoom_out and circle
  • best_x <x>: computes oracle results by choosing, among the best <x> Gaussians, the one that fits the ground-truth trajectory best
  • traj_len <l>: changes the length of the visualized trajectories
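
For example, the following re-evaluates a DAVIS run, renders a circular novel-view trajectory, and shortens the visualized trajectories; the value 10 for traj_len is purely illustrative, not a recommended setting:

# example: evaluation-only run with novel-view rendering (traj_len value is illustrative)
python scripts/run_dynomo.py config/davis/dynomo_davis.py --gpus 0 --just_eval --novel_view_mode circle --traj_len 10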

Acknowledgement

We thank the authors of the repositories whose open-source code our work builds upon.

Citation

If you find our paper and code useful, please cite us:

@article{seidenschwarz2025dynomo,
  author       = {Jenny Seidenschwarz and Qunjie Zhou and Bardienus Pieter Duisterhof and Deva Ramanan and Laura Leal{-}Taix{\'{e}}},
  title        = {DynOMo: Online Point Tracking by Dynamic Online Monocular Gaussian Reconstruction},
  journal      = {3DV},
  year         = {2025},
}
