[CVPR 2024 Workshops] SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks
Various implementations of SERNet-Former with different baselines for multi-tasking (without our additional methods) are now online.
The example deploys the ViT_h_14 baseline with the 'IMAGENET1K_SWAG_E2E_V1' weights and a simple U-Net decoder architecture.
Please also see the tutorials for:
Image Segmentation based on DeepLabV3+_ResNet101 baseline
Image Classification based on ViT_h_14 baseline
16 May 2024
[CVPR 2024 Workshops] The article "SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks" has been accepted to CVPR 2024 Workshops (Equivariant Vision: From Theory to Practice).
January 2024
SERNet-Former set a state-of-the-art result on the Cityscapes validation dataset for pixel-level segmentation: 87.35 % mIoU
January 2024
SERNet-Former set a state-of-the-art result on the CamVid dataset: 84.62 % mIoU
January 2024
SERNet-Former ranked seventh on the Cityscapes test dataset for pixel-level segmentation according to PapersWithCode.com: 84.83 % mIoU
(a) Attention-boosting Gate (AbG) and Attention-boosting Module (AbM) are fused into the encoder part.
(b) Attention-fusion Network (AfN) is introduced into the decoder part.
The breakdown of class accuracies on CamVid dataset
Model | Baseline Architecture | Building | Tree | Sky | Car | Sign | Road | Pedestrian | Fence | Pole | Sidewalk | Bicycle | mIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SERNet-Former | Efficient-ResNet | 93.0 | 88.8 | 95.1 | 91.9 | 73.9 | 97.7 | 76.4 | 83.4 | 57.3 | 90.3 | 83.1 | 84.62 |
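As a quick sanity check on the table above, the sketch below averages the eleven per-class scores. It assumes the reported mIoU is the unweighted mean over classes; `mean_score` is a hypothetical helper written for this illustration and is not part of the repository. Because the table entries are rounded to one decimal, the recomputed mean (about 84.63) can differ from the reported 84.62 in the last digit.

```python
# Per-class scores (%) copied from the CamVid table above.
camvid_scores = {
    "Building": 93.0, "Tree": 88.8, "Sky": 95.1, "Car": 91.9,
    "Sign": 73.9, "Road": 97.7, "Pedestrian": 76.4, "Fence": 83.4,
    "Pole": 57.3, "Sidewalk": 90.3, "Bicycle": 83.1,
}


def mean_score(scores):
    """Unweighted mean over classes (hypothetical helper, assumed definition of mIoU)."""
    return sum(scores.values()) / len(scores)


camvid_mean = mean_score(camvid_scores)
# Class with the lowest score in the table.
weakest_class = min(camvid_scores, key=camvid_scores.get)

print(f"mean over {len(camvid_scores)} classes: {camvid_mean:.2f}")
print(f"weakest class: {weakest_class}")
```

The same helper can be applied to any other per-class breakdown by swapping in the corresponding dictionary.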
The experiment outcomes on CamVid dataset
The breakdown of class accuracies on Cityscapes dataset
Model | Baseline Architecture | road | sidewalk | building | wall | fence | pole | traffic light | traffic sign | vegetation | terrain | sky | person | rider | car | truck | bus | train | motorcycle | bicycle | mIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SERNet-Former | Efficient-ResNet | 98.2 | 90.2 | 94.0 | 67.6 | 68.2 | 73.6 | 78.2 | 82.1 | 94.6 | 75.9 | 96.9 | 90.0 | 77.7 | 96.9 | 86.1 | 93.9 | 91.7 | 70.0 | 82.9 | 84.83 |
The experiment outcomes on Cityscapes dataset
You can download this repository into your environment by running:
git clone https://github.com/serdarch/SERNet-Former.git
@article{Erisen2024SERNetFormer,
title={SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks},
author={Erişen, Serdar},
journal={arXiv preprint arXiv:2401.15741},
year={2024}
}
@inproceedings{Erisen2024CVPRW,
title={SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks},
author={Erişen, Serdar},
booktitle={CVPRW},
year={2024}
}