DTU-PAS/awesome-dynn-for-cv

Awesome Dynamic Neural Networks Papers for Computer Vision and Sensor Fusion Applications


A curated collection of Dynamic Neural Networks (DyNN) papers in the context of Computer Vision and Sensor Fusion Applications. This repository gathers the most relevant research that explores adaptive, dynamic, and efficient neural networks in a variety of settings including token skimming, early exits, and dynamic routing. A section dedicated to Sensor Fusion is also present.

An arXiv preprint of the survey on the papers presented here can be found in A Survey on Dynamic Neural Networks: from Computer Vision to Multi-modal Sensor Fusion.

Citation

If you find this repository useful in your research, please consider citing it:

@misc{montelloSurveyDynamicNeural2025,
  author = {Montello, Fabio and G{\"u}ldenring, Ronja and Scardapane, Simone and Nalpantidis, Lazaros},
  title = {A {{Survey}} on {{Dynamic Neural Networks}}: From {{Computer Vision}} to {{Multi-modal Sensor Fusion}}},
  year = {2025},
  publisher = {arXiv}
}

Table of Contents

Early Exit
Dynamic Routing
Token Skimming
Sensor Fusion
Contributions

Early Exit

Type column legend:
Architecture: Structural network designs.
Method: Algorithmic innovations.
Application: Domain-specific implementations.

Publication Year Title Main contribution Type Code
2024 DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution Robotic Vision-Language-Action Model with situation-based early exits Application :octocat:
2024 A multi-level collaborative self-distillation learning for improving adaptive inference efficiency Self-distillation with dynamic generation of the importance weights Method
2024 AdaDet: An Adaptive Object Detection System Based on Early-Exit Neural Networks Early Exits for object detection Application
2024 EERO: Early Exit with Reject Option for Efficient Classification with limited budget The exiting is formalized as a classification with a reject option Method
2024 Jointly-Learned Exit and Inference for a Dynamic Neural Network: JEI-DNN Presents a loss for both accuracy and inference cost Method :octocat:
2024 Predicting Probabilities of Error to Combine Quantization and Early Exiting: QuEE Combines quantization with early-exit dynamic networks Method
2024 Fixing Overconfidence in Dynamic Neural Networks Bayesian method to highlight out of distribution samples Method :octocat:
2024 Multiple-Exit Tuning: Towards Inference-Efficient Adaptation for Vision Transformer Introduces an adapter for representations in a shared exit space Method
2024 To Exit or Not to Exit: Cost-Effective Early-Exit Architecture Based on Markov Decision Process Method based on the Markov decision process for the early-exit decision Method
2024 Class Based Thresholding in Early Exit Semantic Segmentation Networks Early Exits with threshold tailored to the class Application
2024 CAPEEN: Image Captioning with Early Exits and Knowledge Distillation ViT with Early Exits for image captioning Application :octocat:
2023 FIANCEE: Faster Inference of Adversarial Networks via Conditional Early Exits Early Exits applied to Generative Adversarial Networks Application
2023 Dynamic Perceiver for Efficient Visual Recognition Decouples feature extraction from early classification Architecture :octocat:
2023 Predictive exit: prediction of fine-grained early exits for computation- and energy-efficient inference Learned component to decide where to place the classifiers Method
2023 SEENN: Towards Temporal Spiking Early-Exit Neural Networks Early exit architecture applied to a Spiking Neural Network Application
2023 Self-supervised efficient sample weighting for multi-exit networks Balances the loss contributions via weight prediction Method
2023 Zero time waste in pre-trained early exit neural networks Applies Zero Time Waste to diffusion models Application :octocat:
2023 Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation Token Early Exit for semantic segmentation Architecture :octocat:
2023 Window-Based Early-Exit Cascades for Uncertainty Estimation: When Deep Ensembles are More Efficient than Single Models Study of uncertainty estimation when it comes to Early Exit Method :octocat:
2023 LGViT: Dynamic Early Exiting for Accelerating Vision Transformer Self-distillation to train Early Exit ViT models Method :octocat:
2023 Adaptive Computation with Elastic Input Sequence ViT that can also read and store tokens Architecture
2022 Single-layer vision transformers for more accurate early exits with less overhead Early exits for audiovisual crowd counting with a Transformer Application
2022 Self-Distillation: Towards Efficient and Compact Neural Networks Experiments with various self-distillation techniques Method
2022 Boosted Dynamic Neural Networks Proposes a solution to the train-test mismatch problem Method :octocat:
2022 Meta-GF: Training Dynamic-Depth Neural Networks Harmoniously Weighted policy to alleviate gradient conflict problems Method :octocat:
2022 Temporal Early Exits for Efficient Video Object Detection Temporal early exits for video object detection Application
2022 A Probabilistic Re-Interpretation of Confidence Scores in Multi-Exit Models Trains by weighting the prediction with a trained confidence score Method
2022 Multi-Exit Semantic Segmentation Networks Framework for early exit semantic segmentation Application
2022 Learning to Weight Samples for Dynamic Early-Exiting Networks The loss is provided by a weighting network Method :octocat:
2021 Zero Time Waste: Recycling Predictions in Early Exit Neural Networks Reuses the outputs of internal classifiers in later predictions Method
2021 Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition Image elaboration at different scales with early exits Architecture :octocat:
2021 Harmonized Dense Knowledge Distillation Training for Multi-Exit Architecture Dense knowledge distillation for each exit from all the later exits Method
2021 Empowering Adaptive Early-Exit Inference with Latency Awareness Threshold determination as a non-convex problem Method
2021 Branchy-GNN: A Device-Edge Co-Inference Framework for Efficient Point Cloud Processing Early-exit graph neural network for point clouds Application :octocat:
2021 Adaptive Inference for Face Recognition leveraging Deep Metric Learning-enabled Early Exits Applies BoF early exits to face recognition Application
2021 Anytime Dense Prediction with Confidence Adaptivity Early Exits for semantic segmentation Application :octocat:
2021 Dynamic Early Exit Scheduling for Deep Neural Network Inference through Contextual Bandits Early exits for video analytics Application
2021 FrameExit: Conditional Early Exiting for Efficient Video Recognition Offline frame sampling strategy with early exits Application :octocat:
2021 Improving the Accuracy of Early Exits in Multi-Exit Architecture via Curriculum Learning Uses curriculum learning for training Method
2020 Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing Split computations between on-device and cloud Application
2020 Learning to Stop While Learning to Predict Exit problem seen as variational Bayes Method :octocat:
2020 FlexDNN: Input-Adaptive On-Device Deep Learning for Efficient Mobile Vision Applies early exits to video analysis Application
2020 SPINN: synergistic progressive inference of neural networks over device and cloud Split computations between on-device and cloud Application
2020 HAPI: Hardware-Aware Progressive Inference First to investigate optimal exit positioning Method
2020 Efficient adaptive inference for deep convolutional neural networks using hierarchical early exits Bag-of-features + single classifier Method
2020 Differentiable Branching In Deep Networks for Fast Inference Weighting method to estimate exit confidence Method
2020 Early Exit or Not: Resource-Efficient Blind Quality Enhancement for Compressed Images Application to compressed image enhancement Application :octocat:
2020 Resolution Adaptive Networks for Efficient Inference Processes images at a coarser scale first Architecture :octocat:
2019 Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation Improves self-distillation loss Method :octocat:
2019 SEE: Scheduling Early Exit for Mobile DNN Inference during Service Outage Frame dropping according to budget Application
2019 DynExit: A Dynamic Early-Exit Strategy for Deep Residual Networks Dynamic loss-weight modification Method :octocat:
2019 Distillation-Based Training for Multi-Exit Architecture Introduces self-distillation Method
2019 Adaptive Inference Using Hierarchical Convolutional Bag-of-Features for Low-Power Embedded Platforms Bag-of-features + single classifier Method
2019 Improved Techniques for Training Adaptive Deep Networks Rescale gradient magnitude Method :octocat:
2019 Shallow-Deep Networks: Understanding and Mitigating Network Overthinking Slight variation with respect to BranchyNet Architecture :octocat:
2018 Multi-Scale Dense Networks for Resource Efficient Image Classification Multi-scale architecture Architecture :octocat:
2017 Adaptive Neural Networks for Efficient Inference Combination with a network ensemble Architecture :octocat:
2016 BranchyNet: Fast inference via early exiting from deep neural networks First end-to-end network Architecture :octocat:
2016 Conditional Deep Learning for Energy-Efficient and Enhanced Pattern Recognition Seminal work Architecture
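
Most of the papers above share one control flow: run a cascade of classifier stages and stop at the first one that is confident enough, in the spirit of BranchyNet. The sketch below is illustrative only, not taken from any listed paper; `stages` and `threshold` are hypothetical names.

```python
def early_exit_predict(x, stages, threshold=0.9):
    """Run the stages in order; each stage maps features to
    (new_features, class_probabilities). Exit as soon as the top
    probability clears `threshold`; otherwise fall through to the
    final exit."""
    for depth, stage in enumerate(stages, start=1):
        x, probs = stage(x)
        if max(probs) >= threshold:
            break  # confident enough: take this exit
    # Return the predicted class and the exit depth actually used.
    return probs.index(max(probs)), depth
```

The methods in the table differ mainly in how the exit criterion is obtained (fixed thresholds, reject options, Markov decision processes, learned weightings) rather than in this basic control flow.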

Dynamic Routing

Type column legend:
Path: Adaptive routing or processing paths in the network.
Block: Innovations involving layer- or block-level adjustments.
MoE: Mixture of Experts.
Application: Domain-specific implementations.

Publication Year Title Main contribution Type Code
2024 DyFADet: Dynamic Feature Aggregation for Temporal Action Detection Adapts kernel weights and receptive field for different frames in Temporal Action Detection Application :octocat:
2024 Routers in Vision Mixture of Experts: An Empirical Study Studies different routing approaches for MoE in the context of ViTs MoE
2024 Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning Layer skipping in the fine-tuning of ViT Block :octocat:
2023 Robust Mixture-of-Expert Training for Convolutional Neural Networks Discusses adversarial attacks and robustness in the context of MoE MoE :octocat:
2023 SegBlocks: Block-Based Dynamic Resolution Networks for Real-Time Segmentation Adjusts dynamically the processing resolution of image regions Path :octocat:
2023 GradMDM: Adversarial Attack on Dynamic Networks Studies adversarial attacks on dynamic models Application :octocat:
2023 DPACS: Hardware Accelerated Dynamic Neural Network Pruning through Algorithm-Architecture Co-design Spatial and channel pruning hardware accelerator Application :octocat:
2022 Dynamically Throttleable Neural Networks Self-regulates computation according to the performance target and available resources Path :octocat:
2022 M3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design Method to accelerate MoE for multi-task ViT MoE :octocat:
2022 AdaViT: Adaptive Vision Transformers for Efficient Image Recognition Block skipping in Transformer Block :octocat:
2022 AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition One-stage adaptive patch location for Video Recognition Application :octocat:
2022 AdaFocusV3: On Unified Spatial-Temporal Dynamic Video Recognition Adaptive patch location for Video Recognition with conditional exits Application
2021 Adaptive Focus for Efficient Video Recognition Adaptive patch location for Video Recognition Application :octocat:
2021 Dynamic Network Quantization for Efficient Video Inference Adaptive model precision and frame skipping for Video Recognition Application
2021 Scaling Vision with Sparse Mixture of Experts MoE in the context of ViT MoE :octocat:
2021 Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation Adaptive routing path for Object Detection Application :octocat:
2021 Dynamic Dual Gating Neural Networks Highlight the informative features in both the channel and spatial dimensions Block :octocat:
2021 DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning Addresses the problem of slow convergence in sparse gates MoE
2021 Processor Architecture Optimization for Spatially Dynamic Neural Networks Investigates hardware constraints in the context of DyNN Application
2020 Deep Mixture of Experts via Shallow Embedding MoE at the level of convolutional filters, composed on-the-fly MoE
2020 Dual Dynamic Inference: Enabling More Efficient, Adaptive, and Controllable Deep Inference Layer and channel skipping for IoT applications Application
2020 Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference Spatially execute convolutional filters only on important image patches Block :octocat:
2020 Using Mixture of Expert Models to Gain Insights into Semantic Segmentation MoE for interpretability of Semantic Segmentation Application
2020 Learning Dynamic Routing for Semantic Segmentation Adapts the scale at which each image gets processed Path :octocat:
2020 Learning Layer-Skippable Inference Network MoE combined with layer skipping MoE
2020 Learning to Generate Content-Aware Dynamic Detectors Models the relationship between the sample space and the latent routing space Path
2020 Biased Mixtures of Experts: Enabling Computer Vision Inference Under Data Transfer Limitations MoE with inductive prior bias to certain experts MoE
2020 Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference Layer-wise adaptive quantization and skip Block :octocat:
2019 You Look Twice: GaterNet for Dynamic Filter Selection in CNNs Global gating module for channel selection Block
2019 Channel Gating Neural Networks Learns specialized convolutional kernels as a combination of learnt experts MoE
2019 Dynamic Channel Pruning: Feature Boosting and Suppression Skip negligible input and output channels Block :octocat:
2018 Convolutional Networks with Adaptive Inference Graphs Network with adaptive inference graphs Path :octocat:
2018 Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-Offs by Selective Execution Selective execution with self-defined topology Path
2018 Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning Self-organizing network for multi-task learning Path :octocat:
2018 HydraNets: Specialized Dynamic Architectures for Efficient Inference MoE for features of visually similar classes MoE :octocat:
2018 SkipNet: Learning Dynamic Routing in Convolutional Networks Selectively skip Convolutional blocks based on the previous layer Block :octocat:
2018 EnergyNet: Energy-Efficient Dynamic Inference RNN for skip decision of CNN blocks Block
2018 BlockDrop: Dynamic Inference Paths in Residual Networks RL policy network for residual blocks skipping Block :octocat:
2017 Hard Mixtures of Experts for Large Scale Weakly Supervised Vision MoE of pretrained experts MoE
2017 Spatially Adaptive Computation Time for Residual Networks Dynamically adjusts the number of layers for certain regions of the image Block :octocat:
2017 Adaptive Neural Networks for Efficient Inference Ensemble of networks chained in an acyclic computation graph Path :octocat:
2017 Runtime Neural Pruning Dynamic channel pruning Block
2017 Changing Model Behavior at Test-Time Using Reinforcement Learning Combination of MoE and Early Exits MoE
2016 Network of Experts for Large-Scale Image Categorization Introduces MoE architecture for CNN MoE :octocat:
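
Many of the MoE entries above share the same routing skeleton: a gate scores every expert, only the top-k are executed, and their outputs are combined by the renormalised gate weights. A minimal sketch under those assumptions (`gate`, `experts`, and `k` are illustrative names, not from any cited paper):

```python
import math

def _softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, gate, experts, k=2):
    """Sparse MoE step: score all experts with `gate`, keep the top-k,
    renormalise their scores, and return the weighted sum of the
    selected experts' outputs. Unselected experts are never run,
    which is where the compute savings come from."""
    logits = gate(x)
    top = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:k]
    weights = _softmax([logits[i] for i in top])
    return sum(w * experts[i](x) for w, i in zip(weights, top))
```

Papers such as DSelect-k and the Vision MoE routing studies concentrate on making this top-k selection differentiable and well balanced, not on changing the skeleton itself.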

Token Skimming

Type column legend:
Drop: Dynamic token removal or skipping for efficiency.
Merge: Methods for aggregating similar tokens.
Application: Domain-specific implementations.

Publication Year Title Main contribution Type Code
2024 No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling Per-block skip based on the attention score Drop :octocat:
2024 GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation Uses a Graph-based Token Propagation (GTP) as a merging policy Merge :octocat:
2024 ATFTrans: Attention-Weighted Token Fusion Transformer for Robust and Efficient Object Tracking Token merging for Object Tracking Application
2024 Scene Adaptive Sparse Transformer for Event-based Object Detection Token dropout for Event cameras Application :octocat:
2024 Revisiting Token Pruning for Object Detection and Instance Segmentation Token Skimming for object detection and semantic segmentation Application :octocat:
2024 Token Fusion: Bridging the Gap between Token Pruning and Token Merging Aggregation based on both token pruning and token merging Application
2024 Adaptive Semantic Token Selection for AI-native Goal-oriented Communications Token skimming for variable latency and bandwidth constraints in communication channels Application
2023 Neighbor Patches Merging Reduces Spatial Redundancy of Nature Images Token merging based on the similarity of pixel patches Merge
2023 Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers Scoring method based on Gumbel-Softmax for merging tokens Merge :octocat:
2023 Token Merging with Class Importance Score Skimming based on a weighted average according to an importance score of the token Merge
2023 Content-Aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers Token reduction for ViTs Semantic Segmentation Application :octocat:
2023 Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers Merges tokens according to importance and global token diversity Merge :octocat:
2023 Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference Per-block skip according to an inferred score Drop
2023 MSViT: Dynamic Mixed-scale Tokenization for Vision Transformers Selects the optimal token scale for every image region Application
2023 SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer Introduces activation sparsity for Swin-based models Application :octocat:
2023 Efficient Video Action Detection with Token Dropout and Context Refinement Token dropout for Video Task Recognition Application :octocat:
2023 Token Merging for Fast Stable Diffusion Applies ToMe algorithm to diffusion models Merge :octocat:
2023 Token Merging: Your ViT But Faster Gradually combines similar tokens with a custom matching algorithm Merge :octocat:
2022 A-ViT: Adaptive Tokens for Efficient Vision Transformer Halts tokens by accumulated importance, with a bias toward a target exit depth Drop :octocat:
2022 Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer The least important tokens are summarized by a representative token Merge :octocat:
2022 Dynamic Transformer Networks At each block, a function evaluates which tokens should attend to it Drop
2022 SaiT: Sparse Vision Transformers through Adaptive Token Pruning Token selection based on the weights of the attention block Drop
2022 Not All Patches Are What You Need: Expediting Vision Transformers via Token Reorganizations Merges according to the attention weights of multiple heads at specific blocks Merge :octocat:
2022 Adaptive Token Sampling for Efficient Vision Transformers Selection based on the attention weights of the classification token Drop :octocat:
2021 IA-RED2: Interpretability-Aware Redundancy Reduction for Vision Transformers Trains per block in a curriculum-learning manner with RL Drop
2021 DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification Pruning of redundant tokens progressively and dynamically Drop :octocat:
2021 Chasing Sparsity in Vision Transformers: An End-to-End Exploration Combination of methods to induce token sparsity Drop :octocat:
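
The Drop-style methods above typically rank tokens by an importance score (often the classification token's attention weights) and discard the lowest-ranked ones while keeping the survivors in their original order. A minimal, framework-free sketch of that idea; `skim_tokens` and its arguments are hypothetical names, not from any cited paper:

```python
def skim_tokens(tokens, scores, keep_ratio=0.5):
    """Drop-style token skimming: keep only the `keep_ratio` fraction
    of tokens with the highest importance scores, preserving their
    original (spatial) order for the next transformer block."""
    n_keep = max(1, int(len(tokens) * keep_ratio))
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])  # restore original order
    return [tokens[i] for i in kept]
```

Merge-style methods replace the discard step with an aggregation step, e.g. summarising the pruned tokens into a single representative token (Evo-ViT) or pairing and averaging similar tokens (ToMe).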

Sensor Fusion

Publication Year Title Main contribution Task Code
2024 Adaptive Data Fusion for State Estimation and Control of Power Grids Under Attack Fusion for power grid state estimation to create robustness against attacks Regression
2023 Multi-Modal Gated Mixture of Local-to-Global Experts for Dynamic Image Fusion Multimodal mixture of local-to-global experts Object Detection :octocat:
2023 MIXO: Mixture Of Experts-Based Visual Odometry for Multicamera Autonomous Systems Local optimal expert selection for multicamera visual odometry Odometry
2023 Stress Detection Using Context-Aware Sensor Fusion From Wearable Devices Context-aware sensor fusion for stress detection on embedded devices Stress Detection
2023 CARMA: Context-Aware Runtime Reconfiguration for Energy-Efficient Sensor Fusion Fusion approach to dynamically reconfigure FPGA at runtime Any
2022 Romanus: Robust Task Offloading in Modular Multi-Sensor Autonomous Driving Systems Dynamic offload of sensor process to edge computing units Object Detection
2022 EcoFusion: Energy-Aware Adaptive Sensor Fusion for Efficient Autonomous Vehicle Perception Adds the environmental context to dynamic sensor fusion Object Detection
2022 HydraFusion: Context-Aware Selective Sensor Fusion for Robust and Efficient Autonomous Vehicle Perception Sensor fusion selection to perform Object Detection Object Detection :octocat:
2019 Selective Sensor Fusion for Neural Visual-Inertial Odometry Selective fusion of images and IMU Odometry :octocat:
2018 Modular Sensor Fusion for Semantic Segmentation Late fusion approach for semantic segmentation from the outputs of separately trained experts Semantic Segmentation :octocat:
2018 Estimation of Steering Angle and Collision Avoidance for Automated Driving Using Deep Mixture of Experts Fusion based on road scenes and driving patterns for steering angle estimation Steering Prediction
2017 AdapNet: Adaptive Semantic Segmentation in Adverse Environmental Conditions Convolutional MoE to dynamically fuse different modalities Semantic Segmentation :octocat:
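
At their core, several of the gated fusion approaches above (selective and MoE-style fusion) reduce to weighting per-modality features with learned gate scores before combining them. A minimal late-fusion sketch, assuming one feature vector per modality; `gated_fusion` and its arguments are illustrative names, not from any cited paper:

```python
import math

def gated_fusion(modality_features, gate_logits):
    """Late fusion: turn per-modality gate logits into softmax weights
    and return the element-wise weighted sum of the feature vectors.
    A context-aware gate can suppress unreliable modalities (e.g. a
    camera at night) by driving their logits down."""
    m = max(gate_logits)
    exps = [math.exp(v - m) for v in gate_logits]
    weights = [e / sum(exps) for e in exps]
    dim = len(modality_features[0])
    return [sum(w * feats[d] for w, feats in zip(weights, modality_features))
            for d in range(dim)]
```

Works such as HydraFusion and Selective Sensor Fusion differ mainly in what conditions the gate (environmental context, energy budget, sensor reliability) and at which depth the fusion happens.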

Contributions

Contributions of new awesome DyNN for CV and SF resources are very welcome! Please submit a pull request; if you add a new entry, give a very brief explanation of why you think it should be added.
