A curated collection of Dynamic Neural Networks (DyNN) papers in the context of Computer Vision and Sensor Fusion Applications. This repository gathers the most relevant research that explores adaptive, dynamic, and efficient neural networks in a variety of settings including token skimming, early exits, and dynamic routing. A section dedicated to Sensor Fusion is also present.
An arXiv preprint of the survey covering the papers listed here is available: A Survey on Dynamic Neural Networks: from Computer Vision to Multi-modal Sensor Fusion.
If you find this repository useful in your research, please consider citing it:
@misc{montelloSurveyDynamicNeural2025,
author = {Montello, Fabio and G{\"u}ldenring, Ronja and Scardapane, Simone and Nalpantidis, Lazaros},
title = {A {{Survey}} on {{Dynamic Neural Networks}}: From {{Computer Vision}} to {{Multi-modal Sensor Fusion}}},
year = {2025},
publisher = {arXiv}
}
Type column legend: Architecture Structural network designs. Method Algorithmic innovations. Application Domain-specific implementations.
Type column legend: Path Adaptive routing or processing paths in the network. Block Innovations involving layer or block-level adjustments. MoE Mixture of Experts. Application Domain-specific implementations.
Publication Year | Title | Main contribution | Type | Code |
---|---|---|---|---|
2024 | DyFADet: Dynamic Feature Aggregation for Temporal Action Detection | Adapts kernel weights and receptive field for different frames in Temporal Action Detection | Application | ![]() |
2024 | Routers in Vision Mixture of Experts: An Empirical Study | Studies different approaches of MoE in the context of ViT | MoE | |
2024 | Adaptive Layer Selection for Efficient Vision Transformer Fine-Tuning | Layer skipping in the fine-tuning of ViT | Block | ![]() |
2023 | Robust Mixture-of-Expert Training for Convolutional Neural Networks | Discusses adversarial attacks and robustness in the context of MoE | MoE | ![]() |
2023 | SegBlocks: Block-Based Dynamic Resolution Networks for Real-Time Segmentation | Dynamically adjusts the processing resolution of image regions | Path | ![]() |
2023 | GradMDM: Adversarial Attack on Dynamic Networks | Studies adversarial attacks on dynamic models | Application | ![]() |
2023 | DPACS: Hardware Accelerated Dynamic Neural Network Pruning through Algorithm-Architecture Co-design | Spatial and channel pruning hardware accelerator | Application | ![]() |
2022 | Dynamically Throttleable Neural Networks | Self-regulates computation according to performance targets and available resources | Path | ![]() |
2022 | M3ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design | Method to accelerate MoE for multi-task ViT | MoE | ![]() |
2022 | AdaViT: Adaptive Vision Transformers for Efficient Image Recognition | Block skipping in Transformer | Block | ![]() |
2022 | AdaFocus V2: End-to-End Training of Spatial Dynamic Networks for Video Recognition | One-stage adaptive patch location for Video Recognition | Application | ![]() |
2022 | AdaFocusV3: On Unified Spatial-Temporal Dynamic Video Recognition | Adaptive patch location for Video Recognition with conditional exits | Application | |
2021 | Adaptive Focus for Efficient Video Recognition | Adaptive patch location for Video Recognition | Application | ![]() |
2021 | Dynamic Network Quantization for Efficient Video Inference | Adaptive model precision and frame skipping for Video Recognition | Application | |
2021 | Scaling Vision with Sparse Mixture of Experts | MoE in the context of ViT | MoE | ![]() |
2021 | Learning Dynamic Network Using a Reuse Gate Function in Semi-supervised Video Object Segmentation | Adaptive routing path for Video Object Segmentation | Application | ![]() |
2021 | Dynamic Dual Gating Neural Networks | Highlights the informative features in both the channel and spatial dimensions | Block | ![]() |
2021 | DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning | Addresses the problem of slow convergence in sparse gates | MoE | |
2021 | Processor Architecture Optimization for Spatially Dynamic Neural Networks | Investigates hardware constraints in the context of DyNN | Application | |
2020 | Deep Mixture of Experts via Shallow Embedding | MoE at the level of convolutional filters, composed on-the-fly | MoE | |
2020 | Dual Dynamic Inference: Enabling More Efficient, Adaptive, and Controllable Deep Inference | Layer and channel skipping for IoT applications | Application | |
2020 | Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference | Executes convolutional filters only on important image patches | Block | ![]() |
2020 | Using Mixture of Expert Models to Gain Insights into Semantic Segmentation | MoE for interpretability of Semantic Segmentation | Application | |
2020 | Learning Dynamic Routing for Semantic Segmentation | Adapts the scale at which each image gets processed | Path | ![]() |
2020 | Learning Layer-Skippable Inference Network | MoE combined with layer skipping | MoE | |
2020 | Learning to Generate Content-Aware Dynamic Detectors | Models the relationship between the sample space and the latent routing space | Path | |
2020 | Biased Mixtures of Experts: Enabling Computer Vision Inference Under Data Transfer Limitations | MoE with inductive prior bias to certain experts | MoE | |
2020 | Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference | Layer-wise adaptive quantization and skip | Block | ![]() |
2019 | You Look Twice: GaterNet for Dynamic Filter Selection in CNNs | Global gating module for channel selection | Block | |
2019 | Channel Gating Neural Networks | Learns specialized convolutional kernels as a combination of learned experts | MoE | |
2019 | Dynamic Channel Pruning: Feature Boosting and Suppression | Skips negligible input and output channels | Block | ![]() |
2018 | Convolutional Networks with Adaptive Inference Graphs | Network with adaptive inference graphs | Path | ![]() |
2018 | Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-Offs by Selective Execution | Selective execution with self-defined topology | Path | |
2018 | Routing Networks: Adaptive Selection of Non-linear Functions for Multi-Task Learning | Self-organizing network for multi-task learning | Path | ![]() |
2018 | HydraNets: Specialized Dynamic Architectures for Efficient Inference | MoE for features of visually similar classes | MoE | ![]() |
2018 | SkipNet: Learning Dynamic Routing in Convolutional Networks | Selectively skips convolutional blocks based on the previous layer | Block | ![]() |
2018 | EnergyNet: Energy-Efficient Dynamic Inference | RNN for skip decision of CNN blocks | Block | |
2018 | BlockDrop: Dynamic Inference Paths in Residual Networks | RL policy network for residual block skipping | Block | ![]() |
2017 | Hard Mixtures of Experts for Large Scale Weakly Supervised Vision | MoE of pretrained experts | MoE | |
2017 | Spatially Adaptive Computation Time for Residual Networks | Dynamically adjusts the number of layers for certain regions of the image | Block | ![]() |
2017 | Adaptive Neural Networks for Efficient Inference | Ensemble of networks chained in an acyclic computation graph | Path | ![]() |
2017 | Runtime Neural Pruning | Dynamic channel pruning | Block | |
2017 | Changing Model Behavior at Test-Time Using Reinforcement Learning | Combination of MoE and Early Exits | MoE | |
2016 | Network of Experts for Large-Scale Image Categorization | Introduces MoE architecture for CNN | MoE | ![]() |
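
Many of the Block entries above share one idea: a small gate decides, per input, whether a residual block is executed or bypassed. The following is a minimal pure-Python sketch of that idea, not the method of any listed paper. The names `gated_residual_block`, `toy_block`, and `toy_gate`, as well as the 0.5 threshold, are hypothetical and chosen only for illustration.

```python
from typing import Callable, List

Vector = List[float]


def gated_residual_block(
    x: Vector,
    block: Callable[[Vector], Vector],
    gate: Callable[[Vector], float],
    threshold: float = 0.5,
) -> Vector:
    """Execute `block` as a residual branch only when the gate score
    reaches `threshold`; otherwise return the input unchanged."""
    if gate(x) < threshold:
        return list(x)  # skipped: only the identity shortcut is used
    fx = block(x)
    return [xi + fi for xi, fi in zip(x, fx)]


# Hypothetical stand-ins for a learned block and a learned gate.
def toy_block(x: Vector) -> Vector:
    return [0.1 * v for v in x]


def toy_gate(x: Vector) -> float:
    # Mean magnitude, clipped to 1.0, used as a proxy for input "difficulty".
    return min(1.0, sum(abs(v) for v in x) / len(x))


easy = [0.1, 0.2, 0.1]   # low gate score, so the block is skipped
hard = [2.0, -3.0, 1.0]  # high gate score, so the block is executed

out_easy = gated_residual_block(easy, toy_block, toy_gate)
out_hard = gated_residual_block(hard, toy_block, toy_gate)
```

In a trained network the hard decision is not differentiable, so the gate is typically learned with a relaxation or with reinforcement learning, as several of the entries above describe.
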
Type column legend: Drop Dynamic token removal or skipping for efficiency. Merge Methods for aggregating similar tokens. Application Domain-specific implementations.
Publication Year | Title | Main contribution | Type | Code |
---|---|---|---|---|
2024 | No Token Left Behind: Efficient Vision Transformer via Dynamic Token Idling | Per-block skip based on the attention score | Drop | ![]() |
2024 | GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation | Uses Graph-based Token Propagation (GTP) as a merging policy | Merge | ![]() |
2024 | ATFTrans: Attention-Weighted Token Fusion Transformer for Robust and Efficient Object Tracking | Token merging for Object Tracking | Application | |
2024 | Scene Adaptive Sparse Transformer for Event-based Object Detection | Token dropout for Event cameras | Application | ![]() |
2024 | Revisiting Token Pruning for Object Detection and Instance Segmentation | Token skimming for object detection and instance segmentation | Application | ![]() |
2024 | Token Fusion: Bridging the Gap between Token Pruning and Token Merging | Aggregation based on both token pruning and token merging | Application | |
2024 | Adaptive Semantic Token Selection for AI-native Goal-oriented Communications | Token skimming for variable latency and bandwidth constraints in communication channels | Application | |
2023 | Neighbor Patches Merging Reduces Spatial Redundancy of Nature Images | Token merging based on the similarity of pixel patches | Merge | |
2023 | Joint Token Pruning and Squeezing Towards More Aggressive Compression of Vision Transformers | Scoring method based on Gumbel-Softmax for merging tokens | Merge | ![]() |
2023 | Token Merging with Class Importance Score | Skimming based on a weighted average according to an importance score of the token | Merge | |
2023 | Content-Aware Token Sharing for Efficient Semantic Segmentation with Vision Transformers | Token reduction for ViTs Semantic Segmentation | Application | ![]() |
2023 | Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers | Merges tokens according to importance and global token diversity | Merge | ![]() |
2023 | Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference | Per-block skip according to an inferred score | Drop | |
2023 | MSViT: Dynamic Mixed-scale Tokenization for Vision Transformers | Selects the optimal token scale for every image region | Application | |
2023 | SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer | Introduces activation sparsity for Swin-based models | Application | ![]() |
2023 | Efficient Video Action Detection with Token Dropout and Context Refinement | Token dropout for Video Action Detection | Application | ![]() |
2023 | Token Merging for Fast Stable Diffusion | Applies ToMe algorithm to diffusion models | Merge | ![]() |
2023 | Token Merging: Your ViT But Faster | Gradually combines similar tokens with a custom matching algorithm | Merge | ![]() |
2022 | A-ViT: Adaptive Tokens for Efficient Vision Transformer | Halts tokens by accumulated importance, biased toward a target exit depth | Drop | ![]() |
2022 | Evo-ViT: Slow-Fast Token Evolution for Dynamic Vision Transformer | The least important tokens are summarized by a representative token | Merge | ![]() |
2022 | Dynamic Transformer Networks | At each block, a function evaluates which tokens should attend to it | Drop | |
2022 | SaiT: Sparse Vision Transformers through Adaptive Token Pruning | Token selection based on the weights of the attention block | Drop | |
2022 | Not All Patches Are What You Need: Expediting Vision Transformers via Token Reorganizations | Merges according to the attention weights of multiple heads at specific blocks | Merge | ![]() |
2022 | Adaptive Token Sampling for Efficient Vision Transformers | Selection based on attention weights of the classification token | Drop | ![]() |
2021 | IA-RED2: Interpretability-Aware Redundancy Reduction for Vision Transformers | Trains per block in a curriculum-learning manner with RL | Drop | |
2021 | DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification | Pruning of redundant tokens progressively and dynamically | Drop | ![]() |
2021 | Chasing Sparsity in Vision Transformers: An End-to-End Exploration | Combination of methods to induce token sparsity | Drop | ![]() |
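
The Drop and Merge types in the table above differ in what happens to low-importance tokens: dropping discards them, while merging folds them into a summary token. The sketch below illustrates the contrast in pure Python under simplifying assumptions: importance scores are given as input, and the merge is a score-weighted average into a single extra token. The function names `drop_tokens` and `merge_tokens` are hypothetical and do not reproduce any specific paper's algorithm.

```python
from typing import List

Token = List[float]


def _rank(scores: List[float]) -> List[int]:
    """Token indices sorted from most to least important."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def drop_tokens(tokens: List[Token], scores: List[float], keep: int) -> List[Token]:
    """Drop: keep the `keep` highest-scoring tokens in their original order."""
    kept = sorted(_rank(scores)[:keep])
    return [tokens[i] for i in kept]


def merge_tokens(tokens: List[Token], scores: List[float], keep: int) -> List[Token]:
    """Merge: keep the top `keep` tokens and fuse the remaining ones into one
    score-weighted average token appended at the end."""
    ranked = _rank(scores)
    kept, rest = sorted(ranked[:keep]), ranked[keep:]
    out = [tokens[i] for i in kept]
    if rest:
        weights = [scores[i] for i in rest]
        total = sum(weights)
        if total <= 0:  # fall back to a plain average if all scores are zero
            weights, total = [1.0] * len(rest), float(len(rest))
        dim = len(tokens[0])
        fused = [
            sum(w * tokens[i][d] for w, i in zip(weights, rest)) / total
            for d in range(dim)
        ]
        out.append(fused)
    return out


tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [4.0, 0.0]]
scores = [0.5, 0.1, 0.3, 0.1]  # assumed given, e.g. derived from attention

dropped = drop_tokens(tokens, scores, keep=2)  # 2 tokens remain
merged = merge_tokens(tokens, scores, keep=2)  # 2 kept tokens + 1 fused token
```

Dropping shortens the sequence the most, whereas merging retains a coarse summary of the removed content at the cost of one extra token.
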
Publication Year | Title | Main contribution | Task | Code |
---|---|---|---|---|
2024 | Adaptive Data Fusion for State Estimation and Control of Power Grids Under Attack | Fusion for power grid state estimation that is robust against attacks | Regression | |
2023 | Multi-Modal Gated Mixture of Local-to-Global Experts for Dynamic Image Fusion | Multimodal mixture of local-to-global experts | Object Detection | ![]() |
2023 | MIXO: Mixture Of Experts-Based Visual Odometry for Multicamera Autonomous Systems | Local optimal expert selection for multicamera visual odometry | Odometry | |
2023 | Stress Detection Using Context-Aware Sensor Fusion From Wearable Devices | Context-aware sensor fusion for stress detection on embedded devices | Stress Detection | |
2023 | CARMA: Context-Aware Runtime Reconfiguration for Energy-Efficient Sensor Fusion | Fusion approach to dynamically reconfigure FPGA at runtime | Any | |
2022 | Romanus: Robust Task Offloading in Modular Multi-Sensor Autonomous Driving Systems | Dynamic offloading of sensor processing to edge computing units | Object Detection | |
2022 | EcoFusion: Energy-Aware Adaptive Sensor Fusion for Efficient Autonomous Vehicle Perception | Adds the environmental context to dynamic sensor fusion | Object Detection | |
2022 | HydraFusion: Context-Aware Selective Sensor Fusion for Robust and Efficient Autonomous Vehicle Perception | Sensor fusion selection to perform Object Detection | Object Detection | ![]() |
2019 | Selective Sensor Fusion for Neural Visual-Inertial Odometry | Selective fusion of images and IMU | Odometry | ![]() |
2018 | Modular Sensor Fusion for Semantic Segmentation | Late fusion approach for semantic segmentation from the output of separately trained experts | Semantic Segmentation | ![]() |
2018 | Estimation of Steering Angle and Collision Avoidance for Automated Driving Using Deep Mixture of Experts | Fusion based on road scenes and driving patterns for steering angle estimation | Steering Prediction | |
2017 | AdapNet: Adaptive Semantic Segmentation in Adverse Environmental Conditions | Convolutional MoE to dynamically fuse different modalities | Semantic Segmentation | ![]() |
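
Several sensor fusion entries above weight or select modalities depending on context. A minimal sketch of soft gated fusion follows, assuming each modality already provides a feature vector of the same size and a gating network (not shown) has produced one logit per modality. The names `gated_fusion`, `camera`, and `lidar` and all numeric values are hypothetical.

```python
import math
from typing import Dict, List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(
    features: Dict[str, List[float]], gate_logits: Dict[str, float]
) -> List[float]:
    """Fuse per-modality features as a softmax-weighted sum of the vectors."""
    names = sorted(features)
    weights = softmax([gate_logits[n] for n in names])
    dim = len(features[names[0]])
    return [
        sum(w * features[n][d] for w, n in zip(weights, names))
        for d in range(dim)
    ]


features = {"camera": [1.0, 0.0], "lidar": [0.0, 1.0]}

# Equal confidence: both modalities contribute equally.
balanced = gated_fusion(features, {"camera": 0.0, "lidar": 0.0})

# Camera judged unreliable (for example at night): fusion leans on lidar.
degraded = gated_fusion(features, {"camera": -20.0, "lidar": 0.0})
```

A selective variant would skip the computation for modalities whose weight falls below a threshold, which is where the efficiency gains of dynamic fusion come from.
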
Contributions of new awesome DyNN for CV and SF resources are very welcome! Please submit a pull request; if you add a new entry, please give a very brief explanation of why you think it should be added.