Skip to content

Latest commit

 

History

History

track2

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 

DFC 2023 Track 2

Introduction

Track 2 is a multi task track for building instance segmentation and height prediction, focusing on detecting individual buildings, building instance segmentation, and height information. This track has a large number of neighboring building clusters, which challenges the participants' ability to involve the algorithmic process.

Dataset Format

We provide individual RGB and SAR (Synthetic Aperture Radar) remote sensing images. For better use of multi-modal data, we provide a python script to generate 4-channel images concatenated in the channel dimension, in 4-channel (R,G,B,SAR) tif format. You can run the following command to generate the merge direcory:

python ./make_merge.py $DATASET_ROOT

All the images are in size of 512x512. In the instance segmentation sub-task, the data format follows the MS COCO format, and the annotation uses the json format. In the height prediction sub-task, the data annotation adopts the height ground true formed by the pixel by pixel elevation value corresponding to the RGB images, and the data uses tif format.

The topology of the dataset directory is as follows:

```
DFC_Track_2
├── annotations
│   └── buildings_only_train.json
│   └── buildings_only_val.json
│   └── buildings_only_test.json
├── train
│   └── rgb
│   │   ├── P0001.tif
│   │   └── ...
│   │   └── P0009.tif
│   └── sar
│   │   ├── P0001.tif
│   │   └── ...
│   │   └── P0009.tif
│   └── height
│       ├── P0001.tif
│       └── ...
│       └── P0009.tif
├── val
│   └── rgb
│   │   ├── P0011.tif
│   │   └── ...
│   │   └── P0019.tif
│   └── sar
│   │   ├── P0011.tif
│   │   └── ...
│   │   └── P0019.tif
│   └── height
│       ├── P0011.tif
│       └── ...
│       └── P0019.tif
└── test
    └── rgb
    │   ├── P0021.tif
    │   └── ...
    │   └── P0029.tif
    └── sar
    │   ├── P0021.tif
    │   └── ...
    │   └── P0029.tif
    └── height
        ├── P0021.tif
        └── ...
        └── P0029.tif
```

Baselines

We choose the classical mask rcnn with multimodal multitask learning (height prediction) framework as the contest baseline model. Among the input image modalities are RGB and SAR. We use MMDetection (version 2.25.1) to test the baseline model performance.
The performance report of multimodal multitask learning framework on the validation set of track 2 (instance segmentation and height prediction) is as follows:

Model Modality mAP mAP_50 Delta_1 Delta_2 Delta_3
our baseline RGB+SAR 15.1 41.1 30.1 35.3 39.6

Submission Format

The documents submitted should be a folder. The topology of the submitted floder directory is as follows:

```
DFC2023_Track_2_submit
├── seg_results.json
└── height
    ├── P0011.tif
    └── P0012.tif
    └── ...
    └── P0019.tif
```