Official implementation of 'Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training'.
The paper has been accepted by NeurIPS 2022.
- We fixed some bugs in pre-training and released the fine-tuning code of Point-M2AE 📌.
- Our latest work, I2P-MAE, has been accepted by CVPR 2023 🔥 and open-sourced. I2P-MAE leverages 2D pre-trained models to guide the pre-training of Point-M2AE and achieves SOTA performance on various 3D tasks.
Comparison with existing MAE-based models for self-supervised 3D point cloud learning on the ModelNet40 dataset:
Method | Parameters | GFlops | Extra Data | Linear SVM | Fine-tuning | Voting |
---|---|---|---|---|---|---|
Point-BERT | 22.1M | 4.8 | - | 87.4% | 92.7% | 93.2% |
ACT | 22.1M | 4.8 | 2D | - | - | 93.7% |
Point-MAE | 22.1M | 4.8 | - | 91.0% | 93.2% | 93.8% |
Point-M2AE | 12.9M | 3.6 | - | 92.9% | 93.4% | 94.0% |
I2P-MAE | 12.9M | 3.6 | 2D | 93.4% | 93.7% | 94.1% |
Point-M2AE is a strong Multi-scale MAE pre-training framework for hierarchical self-supervised learning of 3D point clouds. Unlike the standard transformer in MAE, we modify the encoder and decoder into pyramid architectures to progressively model spatial geometries and capture both fine-grained and high-level semantics of 3D shapes. We design a multi-scale masking strategy to generate consistent visible regions across scales, and reconstruct the masked coordinates from a global-to-local perspective.
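To make the idea of "consistent visible regions across scales" concrete, here is a minimal sketch, not the repository's implementation. It assumes the hierarchy is given as plain index lists (`groupings`, a hypothetical structure), masks randomly only at the coarsest scale, and then back-projects the visible tokens to the finer scales so that a token is visible only if its parent is.

```python
import random


def multiscale_visible_sets(groupings, mask_ratio, seed=0):
    """Return one set of visible token indices per scale, finest scale first.

    groupings[i][j] lists the indices of scale-i tokens that are grouped
    into token j of scale i+1 (hypothetical layout, for illustration only).
    """
    rng = random.Random(seed)
    num_coarsest = len(groupings[-1])
    num_visible = max(1, round(num_coarsest * (1.0 - mask_ratio)))

    # Random masking happens only once, at the coarsest scale.
    visible = set(rng.sample(range(num_coarsest), num_visible))
    visible_per_scale = [visible]

    # Back-project: a finer token is visible if its parent token is visible.
    for grouping in reversed(groupings):
        visible = {child for parent in visible for child in grouping[parent]}
        visible_per_scale.append(visible)

    return visible_per_scale[::-1]


# Toy hierarchy: 8 fine tokens -> 4 mid tokens -> 2 coarse tokens.
groupings = [[[0, 1], [2, 3], [4, 5], [6, 7]], [[0, 1], [2, 3]]]
fine, mid, coarse = multiscale_visible_sets(groupings, mask_ratio=0.5)
print(len(fine), len(mid), len(coarse))
```

Because every finer visible set is derived from the coarser one, the encoder never sees a fine-scale token whose coarse-scale parent is masked.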
Pre-trained on ShapeNet, Point-M2AE is evaluated with a linear SVM on the ModelNet40 and ScanObjectNN (OBJ-BG split) datasets, without downstream fine-tuning:
Task | Dataset | Config | MN40 Acc. | OBJ-BG Acc. | Ckpts | Logs |
---|---|---|---|---|---|---|
Pre-training | ShapeNet | point-m2ae.yaml | 92.87% | 84.12% | pre-train.pth | log |
Synthetic shape classification on ModelNet40 with 1k points:
Task | Config | Acc. | Vote | Ckpts | Logs |
---|---|---|---|---|---|
Classification | modelnet40.yaml | 93.43% | 93.96% | modelnet40.pth | modelnet40.log |
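The "Vote" column reports accuracy with test-time voting. This README does not describe how voting is implemented, so the following is only a generic sketch under the assumption that the model is run several times on differently augmented copies of each shape and the logits are averaged before taking the argmax. The function name and input layout are hypothetical.

```python
def vote_predict(logit_runs):
    """Average logits over voting rounds and return one class index per sample.

    logit_runs[v][n][c] is the logit of class c for sample n in round v.
    """
    num_rounds = len(logit_runs)
    num_samples = len(logit_runs[0])
    num_classes = len(logit_runs[0][0])

    predictions = []
    for n in range(num_samples):
        mean_logits = [
            sum(run[n][c] for run in logit_runs) / num_rounds
            for c in range(num_classes)
        ]
        predictions.append(max(range(num_classes), key=mean_logits.__getitem__))
    return predictions


# Two voting rounds, two samples, three classes.
runs = [
    [[2.0, 1.0, 0.0], [0.0, 1.0, 0.5]],
    [[0.0, 1.5, 0.0], [0.0, 0.2, 2.0]],
]
print(vote_predict(runs))      # → [1, 2]
print(vote_predict(runs[:1]))  # → [0, 1] (a single round decides differently)
```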
Real-world shape classification on ScanObjectNN:
Task | Split | Config | Acc. | Ckpts | Logs |
---|---|---|---|---|---|
Classification | PB-T50-RS | scan_pb.yaml | 86.43% | scan_pd.pth | scan_pd.log |
Classification | OBJ-BG | scan_obj-bg.yaml | 91.22% | scan_obj-bg.pth | scan_obj-pd.log |
Classification | OBJ-ONLY | scan_obj.yaml | 88.81% | scan_obj.pth | scan_obj.log |
Part segmentation on ShapeNetPart:
Task | Dataset | Config | mIoUc | mIoUi | Ckpts | Logs |
---|---|---|---|---|---|---|
Segmentation | ShapeNetPart | segmentation | 84.86% | 86.51% | seg.pth | seg.log |
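The table does not define mIoUc and mIoUi. Assuming they follow the usual ShapeNetPart convention (class-averaged and instance-averaged mean IoU), the difference can be sketched as below. The `(category, iou)` input format is a hypothetical simplification.

```python
from collections import defaultdict


def mean_ious(shape_ious):
    """Return (class_mIoU, instance_mIoU) from per-shape IoU values.

    shape_ious is a list of (category, iou) pairs, one per test shape.
    """
    per_category = defaultdict(list)
    for category, iou in shape_ious:
        per_category[category].append(iou)

    # Instance mIoU: every shape counts equally.
    instance_miou = sum(iou for _, iou in shape_ious) / len(shape_ious)
    # Class mIoU: average within each category first, then across categories.
    class_miou = sum(sum(v) / len(v) for v in per_category.values()) / len(per_category)
    return class_miou, instance_miou


# Three chairs at 0.9 and one cap at 0.5: the rare category pulls class mIoU down.
class_miou, instance_miou = mean_ious(
    [("chair", 0.9), ("chair", 0.9), ("chair", 0.9), ("cap", 0.5)]
)
print(round(class_miou, 3), round(instance_miou, 3))  # → 0.7 0.8
```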
Few-shot classification on ModelNet40:
Task | Dataset | Config | 5w10s | 5w20s | 10w10s | 10w20s |
---|---|---|---|---|---|---|
Few-shot Cls. | ModelNet40 | - | 96.8% | 98.3% | 92.3% | 95.0% |
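The column names follow the usual few-shot shorthand, e.g. 5w10s for 5-way 10-shot. As a rough illustration of what one such episode looks like, the sketch below samples support and query indices from a label list. The function, its `n_query` parameter, and the toy labels are hypothetical and not taken from this repository.

```python
import random


def sample_episode(labels, n_way, k_shot, n_query, seed=0):
    """Sample one N-way K-shot episode and return (classes, support, query).

    support holds k_shot indices per class, query holds n_query per class,
    and the two index lists never overlap.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for cls in classes:
        picked = rng.sample(by_class[cls], k_shot + n_query)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:])
    return classes, support, query


# Toy label list: 10 classes with 40 shapes each.
labels = [i % 10 for i in range(400)]
classes, support, query = sample_episode(labels, n_way=5, k_shot=10, n_query=20)
print(len(classes), len(support), len(query))  # → 5 50 100
```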
Create a conda environment and install basic dependencies:
git clone https://github.com/ZrrSkywalker/Point-M2AE.git
cd Point-M2AE
conda create -n pointm2ae python=3.8
conda activate pointm2ae
# Install the matching versions of torch and torchvision
conda install pytorch torchvision cudatoolkit
# e.g., conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3
pip install -r requirements.txt
# There might be an issue installing open3d==0.9.0 with python=3.8
# A workaround is to use the next version up
pip install open3d==0.10.0
Install GPU-related packages:
# Chamfer Distance and EMD
cd ./extensions/chamfer_dist
python setup.py install --user
cd ../emd
python setup.py install --user
# PointNet++
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
# GPU kNN
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
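After installation, a quick sanity check can confirm that the packages are importable. The sketch below is only a convenience helper, not part of the repository. The import names `pointnet2_ops` and `knn_cuda` are what those packages normally expose. The import names of the locally built Chamfer Distance and EMD extensions are not stated here, so they are left out.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found by the importer."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed top-level import names; extend the list for your own setup.
    expected = ["torch", "open3d", "pointnet2_ops", "knn_cuda"]
    print("missing:", missing_modules(expected) or "none")
```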
For pre-training and fine-tuning, please follow DATASET.md to install the ShapeNet, ModelNet40, ScanObjectNN, and ShapeNetPart datasets, referring to Point-BERT. Specifically for Linear SVM evaluation, download the official ModelNet40 dataset and put the unzipped folder under data/.
The final directory structure should be:
│Point-M2AE/
├──cfgs/
├──datasets/
├──data/
│ ├──ModelNet/
│ ├──ModelNetFewshot/
│ ├──modelnet40_ply_hdf5_2048/ # Specifically for Linear SVM
│ ├──ScanObjectNN/
│ ├──ShapeNet55-34/
│ ├──shapenetcore_partanno_segmentation_benchmark_v0_normal/
├──...
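Before launching a run, the layout above can be checked with a few lines of Python. This is a small helper sketch, assuming the folder names shown in the tree. The function and constant names are hypothetical.

```python
from pathlib import Path

# Folder names copied from the directory tree above.
EXPECTED_DATA_DIRS = [
    "ModelNet",
    "ModelNetFewshot",
    "modelnet40_ply_hdf5_2048",
    "ScanObjectNN",
    "ShapeNet55-34",
    "shapenetcore_partanno_segmentation_benchmark_v0_normal",
]


def missing_data_dirs(repo_root):
    """Return the expected dataset folders that are absent under <repo_root>/data."""
    data_root = Path(repo_root) / "data"
    return [name for name in EXPECTED_DATA_DIRS if not (data_root / name).is_dir()]


if __name__ == "__main__":
    # Run from the repository root.
    print("missing:", missing_data_dirs(".") or "none")
```

Not every dataset is needed for every task, so a non-empty result is only a problem if the missing folder belongs to the experiment you want to run.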
Point-M2AE is pre-trained on the ShapeNet dataset with the config file cfgs/pre-training/point-m2ae.yaml. Run:
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/pre-training/point-m2ae.yaml --exp_name pre-train
To evaluate the pre-trained Point-M2AE with a linear SVM, create a folder ckpts/ and download pre-train.pth into it. Use the configs in cfgs/linear-svm/ and specify the evaluation dataset with --test_svm.
For ModelNet40, run:
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/linear-svm/modelnet40.yaml --test_svm modelnet40 --exp_name test_svm --ckpts ./ckpts/pre-train.pth
For ScanObjectNN (OBJ-BG split), run:
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/linear-svm/scan_obj-bg.yaml --test_svm scan --exp_name test_svm --ckpts ./ckpts/pre-train.pth
Please create a folder ckpts/ and download pre-train.pth into it. The fine-tuning configs are in cfgs/fine-tuning/.
For ModelNet40, run:
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/modelnet40.yaml --finetune_model --exp_name finetune --ckpts ckpts/pre-train.pth
For the three splits of ScanObjectNN, run:
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_pb.yaml --finetune_model --exp_name finetune --ckpts ckpts/pre-train.pth
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_obj.yaml --finetune_model --exp_name finetune --ckpts ckpts/pre-train.pth
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_obj-bg.yaml --finetune_model --exp_name finetune --ckpts ckpts/pre-train.pth
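If you prefer to script the three ScanObjectNN runs, the commands above can be generated from the config names. The sketch below only builds and prints the command lines using the flags shown in this README. The helper name and the per-split experiment names are choices made here for illustration, not repository conventions.

```python
import shlex


def finetune_command(config_name, ckpt="ckpts/pre-train.pth", exp_name="finetune"):
    """Build the fine-tuning command for one config in cfgs/fine-tuning/."""
    return [
        "python", "main.py",
        "--config", f"cfgs/fine-tuning/{config_name}.yaml",
        "--finetune_model",
        "--exp_name", exp_name,
        "--ckpts", ckpt,
    ]


SCAN_SPLITS = ["scan_pb", "scan_obj", "scan_obj-bg"]

for split in SCAN_SPLITS:
    # A distinct experiment name per split makes the runs easy to tell apart.
    print(shlex.join(finetune_command(split, exp_name=f"finetune_{split}")))
```

To launch the runs instead of printing them, each list can be passed to `subprocess.run` with `CUDA_VISIBLE_DEVICES` set in the environment.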
For ShapeNetPart, first change into the segmentation/ folder, then run:
cd segmentation
CUDA_VISIBLE_DEVICES=0 python main.py --model Point_M2AE_SEG --log_dir finetune --ckpts ./ckpts/pre-train.pth
Please download the pre-trained models from here and put them into the folder ckpts/.
For ModelNet40 without voting, run:
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/modelnet40.yaml --test --exp_name finetune --ckpts ckpts/modelnet.pth
For ModelNet40 with voting, run:
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/modelnet40.yaml --test --vote --exp_name finetune_vote --ckpts ckpts/modelnet.pth
For the three splits of ScanObjectNN, run:
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_pb.yaml --test --exp_name finetune --ckpts ckpts/scan_pb.pth
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_obj.yaml --test --exp_name finetune --ckpts ckpts/scan_obj.pth
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/fine-tuning/scan_obj-bg.yaml --test --exp_name finetune --ckpts ckpts/scan_obj-bg.pth
This repo benefits from Point-BERT and Point-MAE. Thanks for their wonderful work.
@article{zhang2022point,
title={Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training},
author={Zhang, Renrui and Guo, Ziyu and Gao, Peng and Fang, Rongyao and Zhao, Bin and Wang, Dong and Qiao, Yu and Li, Hongsheng},
journal={arXiv preprint arXiv:2205.14401},
year={2022}
}
If you have any questions about this project, please feel free to contact [email protected].