We track markers on each object with the OptiTrack system, render the object models at the tracked poses, and use those renders to collect ground-truth (GT) semantic segmentation data.
https://docs.google.com/spreadsheets/d/1weyPvCyxU82EIokMlGhlK5uEbNt9b8a-54ziPpeGjRo/edit?usp=sharing
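In other words, once an object's pose is known in the camera frame, the object model can be projected into the image and the covered pixels labeled with the object's class id. The sketch below only illustrates that idea: it fills the convex hull of projected model points rather than rendering the full mesh as the toolbox does, and the names (`model_pts`, `T_cam_obj`, `K`, `obj_id`) are hypothetical.

```python
import numpy as np
import cv2

def silhouette_mask(model_pts, T_cam_obj, K, img_hw, obj_id):
    """Rough GT mask: project object model points and fill their convex hull.

    model_pts : (N, 3) points sampled from the object model, in the object frame
    T_cam_obj : (4, 4) object pose in the camera frame (from OptiTrack + extrinsic)
    K         : (3, 3) camera intrinsic matrix
    obj_id    : integer class id written into the mask
    """
    h, w = img_hw
    mask = np.zeros((h, w), dtype=np.uint8)
    # Move model points into the camera frame.
    pts_h = np.hstack([model_pts, np.ones((len(model_pts), 1))])
    pts_cam = (T_cam_obj @ pts_h.T).T[:, :3]
    pts_cam = pts_cam[pts_cam[:, 2] > 1e-6]        # keep only points in front of the camera
    if len(pts_cam) == 0:
        return mask
    # Pinhole projection to pixel coordinates.
    uv = (K @ pts_cam.T).T
    uv = (uv[:, :2] / uv[:, 2:3]).astype(np.int32)
    hull = cv2.convexHull(uv)                      # (M, 1, 2) silhouette outline
    cv2.fillPoly(mask, [hull], int(obj_id))
    return mask
```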
Final dataset output:

- `objects` folder
- `cameras` folder
- `scenes` folder
  - per-scene data: `scenes/<scene number>/data/`
  - per-scene metadata: `scenes/<scene number>/scene_meta.yaml` (see the loading sketch below)
- `toolbox` folder
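A minimal sketch of how a consumer of the dataset might walk this layout, assuming `scene_meta.yaml` is plain YAML and that per-frame files sit directly under `data/`; the metadata fields are not spelled out here because the actual schema is defined by the toolbox.

```python
import os
import yaml  # PyYAML

def load_scene(dataset_root, scene_number):
    """Load one scene's metadata and list its per-frame data files."""
    scene_dir = os.path.join(dataset_root, "scenes", str(scene_number))
    with open(os.path.join(scene_dir, "scene_meta.yaml")) as f:
        meta = yaml.safe_load(f)               # objects, camera, etc. (schema defined by the toolbox)
    data_dir = os.path.join(scene_dir, "data")
    frames = sorted(os.listdir(data_dir))      # color / depth / label files for every frame
    return meta, [os.path.join(data_dir, name) for name in frames]
```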
1. Setup
   - Place the ARUCO marker near the origin (its exact location no longer matters, but it makes sense to keep it near the Opti origin)
   - Calibrate the OptiTrack if desired (this doesn't need to be done every time, but note that recalibrating changes the extrinsic)
   - Place a single OptiTrack marker in the center of the ARUCO marker and use it to compute the ARUCO -> Opti transform
     - Place the marker position into `calculate_extrinsic/aruco_marker.txt`
   - Place OptiTrack markers on the corners of the ARUCO marker and use them to compute the ARUCO -> Opti transform as well (see the alignment sketch below)
     - Place the marker positions into `calculate_extrinsic/aruco_corners.yaml`
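One way to obtain the ARUCO -> Opti transform from the corner markers is a rigid (Kabsch) alignment between the corner coordinates expressed in the ARUCO frame and the positions measured by OptiTrack. This is only a sketch under assumed conventions (corner ordering, a hypothetical `marker_size`, an assumed `corners` key in the YAML file); it is not necessarily how the toolbox computes the transform.

```python
import numpy as np
import yaml

def rigid_align(src, dst):
    """Kabsch: find R, t minimizing ||R @ src_i + t - dst_i|| over corresponding 3D points."""
    src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(0) - R @ src.mean(0)
    return R, t

# Corner coordinates in the ARUCO frame (ordering and marker size are assumptions).
marker_size = 0.15                              # meters, hypothetical
half = marker_size / 2.0
aruco_corners = np.array([[-half,  half, 0.0],
                          [ half,  half, 0.0],
                          [ half, -half, 0.0],
                          [-half, -half, 0.0]])

# The same corners measured by OptiTrack (file layout assumed).
with open("calculate_extrinsic/aruco_corners.yaml") as f:
    opti_corners = np.array(yaml.safe_load(f)["corners"], dtype=float)   # (4, 3)

R, t = rigid_align(aruco_corners, opti_corners)  # ARUCO -> Opti rotation and translation
```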
2. Record Data (`tools/capture_data.py`; a simplified capture sketch follows this step)
   - ARUCO calibration
   - Data collection
     - If this is an extrinsic scene, the data collection phase should be spent observing the ARUCO marker
   - Example: `python tools/capture_data.py --scene_name official_test az_camera`
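For context, the capture step only needs to save frames together with timestamps so they can later be synced against OptiTrack poses. The actual script drives the Azure Kinect (or iPhone); the stand-in below uses a generic OpenCV camera and hypothetical file names purely to show the frame/timestamp bookkeeping.

```python
import csv
import os
import time

import cv2

def capture_frames(scene_dir, num_frames=300, camera_index=0):
    """Save color frames plus per-frame timestamps for later syncing with Opti poses."""
    os.makedirs(os.path.join(scene_dir, "data"), exist_ok=True)
    cap = cv2.VideoCapture(camera_index)
    with open(os.path.join(scene_dir, "timestamps.csv"), "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["frame", "timestamp"])
        for i in range(num_frames):
            ok, frame = cap.read()
            if not ok:
                break
            writer.writerow([i, time.time()])                      # wall-clock stamp per frame
            cv2.imwrite(os.path.join(scene_dir, "data", f"{i:05d}_color.png"), frame)
    cap.release()
```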
3. Check whether the extrinsic file exists
   - If the extrinsic file doesn't exist, calculate the extrinsic through Step 4
   - Otherwise, process the data through Step 5 to generate ground-truth labels
4. Process Extrinsic Data to Calculate Extrinsic (if extrinsic scene)
   - Clean raw Opti poses (`tools/process_data.py`)
   - Sync Opti poses with frames (`tools/process_data.py`)
   - Calculate the camera extrinsic (`tools/calculate_camera_extrinsic.py`; see the sketch after this step)
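The camera extrinsic can be recovered by chaining two transforms: the ARUCO marker's pose in the camera frame (from the detected corners via `cv2.solvePnP`) and the ARUCO marker's pose in the Opti frame (from Setup). A sketch under assumed frame conventions, not necessarily the exact math inside `tools/calculate_camera_extrinsic.py`:

```python
import numpy as np
import cv2

def to_homogeneous(R, t):
    """Pack a rotation matrix and translation vector into a 4x4 transform."""
    T = np.eye(4)
    T[:3, :3], T[:3, 3] = R, np.asarray(t).ravel()
    return T

def camera_extrinsic(aruco_obj_pts, aruco_img_pts, K, dist, T_opti_aruco):
    """Compute T_cam_opti (Opti frame -> camera frame) from one view of the ARUCO marker.

    aruco_obj_pts : (4, 3) marker corners in the ARUCO frame
    aruco_img_pts : (4, 2) the same corners detected in the image (e.g. with cv2.aruco)
    T_opti_aruco  : (4, 4) ARUCO pose in the Opti frame, from the Setup step
    """
    ok, rvec, tvec = cv2.solvePnP(aruco_obj_pts.astype(np.float32),
                                  aruco_img_pts.astype(np.float32), K, dist)
    R_cam_aruco, _ = cv2.Rodrigues(rvec)            # rotation vector -> rotation matrix
    T_cam_aruco = to_homogeneous(R_cam_aruco, tvec)
    # Chain the transforms: camera <- ARUCO <- Opti.
    return T_cam_aruco @ np.linalg.inv(T_opti_aruco)
```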
5. Process Data (if real scene)
   - Clean raw Opti poses (`tools/process_data.py`)
     - Example: `python tools/process_data.py --scene_name [SCENE_NAME]`
   - Sync Opti poses with frames (`tools/process_data.py`)
     - Example: `python tools/process_data.py --scene_name [SCENE_NAME]`
   - Manually annotate first-frame object poses (`tools/manual_annotate_poses.py`)
     - Modify `[SCENE_NAME]/scene_meta.yaml` by adding an `objects` field listing the objects in the scene and their corresponding IDs
     - Example: `python tools/manual_annotate_poses.py official_test`
   - Recover all frame object poses and verify correctness (`tools/generate_scene_labeling.py`; see the pose-propagation sketch below)
     - Example: `python tools/generate_scene_labeling.py --fast [SCENE_NAME]`
   - Generate semantic labeling (`tools/generate_scene_labeling.py`)
     - Example: `python tools/generate_scene_labeling.py [SCENE_NAME]`
   - Generate per-frame object poses (`tools/generate_scene_labeling.py`)
     - Example: `python tools/generate_scene_labeling.py [SCENE_NAME]`
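Recovering every frame's object poses from the single manual annotation is just transform composition: the annotated first-frame pose fixes each object's constant offset from its OptiTrack rigid body, and that offset is carried through the body's tracked pose in every later frame. A minimal sketch under assumed frame names; the toolbox may organize this differently.

```python
import numpy as np

def recover_object_poses(T_cam_opti, T_opti_body_per_frame, T_cam_obj_frame0):
    """Propagate the manually annotated first-frame object pose to every frame.

    T_cam_opti            : (4, 4) camera extrinsic (Opti frame -> camera frame)
    T_opti_body_per_frame : list of (4, 4) OptiTrack rigid-body poses, one per frame
    T_cam_obj_frame0      : (4, 4) manually annotated object pose in frame 0 (camera frame)
    """
    # The object's offset from its rigid body is fixed for the whole scene.
    T_cam_body0 = T_cam_opti @ T_opti_body_per_frame[0]
    T_body_obj = np.linalg.inv(T_cam_body0) @ T_cam_obj_frame0
    # Per-frame object pose in the camera frame.
    return [T_cam_opti @ T_opti_body @ T_body_obj
            for T_opti_body in T_opti_body_per_frame]
```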
- Extrinsic scenes have their color images inside `data` stored as `png`; this is to maximize performance. Data scenes have their color images inside `data` stored as `jpg`; this is necessary so the dataset remains usable.
- The iPhone outputs `jpg` raw color images, while the Azure Kinect outputs `png` raw color images.
Finally, select SOTA pose estimation and image segmentation models and evaluate them on the dataset according to standard metrics (e.g., mean IoU for segmentation; a small IoU sketch follows).
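As one concrete segmentation metric, mean intersection-over-union (mIoU) can be computed directly from the generated GT label images and a model's predictions. A small sketch assuming both are integer class-id masks of matching size, with class 0 treated as background:

```python
import numpy as np

def mean_iou(gt_masks, pred_masks, num_classes, ignore_id=0):
    """Mean IoU over classes, accumulated across all evaluation frames.

    gt_masks, pred_masks : iterables of (H, W) integer class-id arrays
    ignore_id            : class id to skip (assumed to be background)
    """
    inter = np.zeros(num_classes)
    union = np.zeros(num_classes)
    for gt, pred in zip(gt_masks, pred_masks):
        for c in range(num_classes):
            if c == ignore_id:
                continue
            gt_c, pred_c = gt == c, pred == c
            inter[c] += np.logical_and(gt_c, pred_c).sum()
            union[c] += np.logical_or(gt_c, pred_c).sum()
    valid = union > 0                          # only classes that appear in GT or prediction
    return float(np.mean(inter[valid] / union[valid]))
```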