This repository has been archived by the owner on Nov 28, 2024. It is now read-only.
[CVPR 2024 Workshops] SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks

SERNet-Former

[CVPR 2024 Workshops] YouTube Video CVPR 2024 Workshop ArXiv paper CVMI 2024

Tutorials

Various implementations of SERNet-Former with different baselines for multi-tasking (without our additional methods) are now online.

The example deploys the ViT_h_14 baseline with the weights 'IMAGENET1K_SWAG_E2E_V1' and a simple U-Net decoder architecture (available as a Colab notebook).

Please also see the tutorials for:

Image Segmentation based on the DeepLabV3+_ResNet101 baseline (Colab notebook)

Image Classification based on the ViT_h_14 baseline (Colab notebook)

News

  • 16 May 2024 [CVPR 2024 Workshops] The article "SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks" has been accepted to the CVPR 2024 Workshop "Equivariant Vision: From Theory to Practice".
  • January 2024 SERNet-Former set a state-of-the-art result on the Cityscapes validation dataset for pixel-level segmentation: 87.35% mIoU
  • January 2024 SERNet-Former set a state-of-the-art result on the CamVid dataset: 84.62% mIoU
  • January 2024 SERNet-Former ranked seventh on the Cityscapes test dataset for pixel-level segmentation according to PapersWithCode.com: 84.83% mIoU
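
The results above are reported as mIoU (mean intersection over union). For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, computed from a confusion matrix. The function name `mean_iou` and the toy matrix are illustrative assumptions; this is not the evaluation code used for the reported numbers.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] = number of pixels of true class i predicted as class j.
    Per-class IoU is TP / (TP + FP + FN); classes with no pixels in either
    the ground truth or the prediction are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted as something else
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, truly something else
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy two-class example: IoU is 8/11 for class 0 and 9/12 for class 1.
toy_confusion = [[8, 2],
                 [1, 9]]
toy_miou = mean_iou(toy_confusion)
```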

SERNet-Former Conceptual

Efficient-ResNet

Figure 1

(a) The Attention-boosting Gate (AbG) and Attention-boosting Module (AbM) are fused into the encoder part.

(b) The Attention-fusion Network (AfN) is introduced into the decoder part.

Experiment Results

CamVid Dataset

The breakdown of class accuracies on CamVid dataset

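| Model | Baseline Architecture | Building | Tree | Sky | Car | Sign | Road | Pedestrian | Fence | Pole | Sidewalk | Bicycle | mIoU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SERNet-Former | Efficient-ResNet | 93.0 | 88.8 | 95.1 | 91.9 | 73.9 | 97.7 | 76.4 | 83.4 | 57.3 | 90.3 | 83.1 | 84.62 |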
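
As a quick sanity check on the CamVid row above, the per-class values can be averaged directly. This sketch assumes the eleven class columns are per-class IoU percentages and that the mIoU column is their unweighted mean; the variable names are illustrative.

```python
# Per-class values copied from the CamVid row above (assumed to be IoU in %).
camvid_iou = {
    "Building": 93.0, "Tree": 88.8, "Sky": 95.1, "Car": 91.9,
    "Sign": 73.9, "Road": 97.7, "Pedestrian": 76.4, "Fence": 83.4,
    "Pole": 57.3, "Sidewalk": 90.3, "Bicycle": 83.1,
}

# Unweighted mean over the 11 classes.
camvid_mean = sum(camvid_iou.values()) / len(camvid_iou)
print(f"{camvid_mean:.2f}")  # about 84.63
```

The result is within 0.01 of the reported 84.62, a gap consistent with each class value being rounded to one decimal place.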

The experiment outcomes on CamVid dataset


Cityscapes

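| Model | Baseline Architecture | road | sidewalk | building | wall | fence | pole | traffic light | traffic sign | vegetation | terrain | sky | person | rider | car | truck | bus | train | motorcycle | bicycle | mIoU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SERNet-Former | Efficient-ResNet | 98.2 | 90.2 | 94.0 | 67.6 | 68.2 | 73.6 | 78.2 | 82.1 | 94.6 | 75.9 | 96.9 | 90.0 | 77.7 | 96.9 | 86.1 | 93.9 | 91.7 | 70.0 | 82.9 | 84.83 |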

The experiment outcomes on Cityscapes dataset


Installation Support

To download this repository into your environment, run:

git clone https://github.com/serdarch/SERNet-Former.git

Citations

@article{Erisen2024SERNetFormer,
  title={SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks},
  author={Erişen, Serdar},
  journal={arXiv preprint arXiv:2401.15741},
  year={2024}
}

@inproceedings{Erisen2024CVPRW,
  title={SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks},
  author={Erişen, Serdar},
  booktitle={CVPRW},
  year={2024},
}
