- activation
- active learning
- adaptation
- adversarial training
- antialiasing
- asr
- attention
- augmentation
- autoregressive model
- backbone
- bayesian
- bert
- bias
- calibration
- causality
- channel attention
- chat
- computation
- continual learning
- contrastive learning
- convolution
- dataset
- ddpm
- decoding
- deep prior
- differentiable operator
- differentiable tree
- discrete vae
- disentangle
- distillation
- distributed training
- domain adaptation
- dropout
- efficient attention
- embedding
- end2end
- energy based model
- ensemble
- federated learning
- few shot
- finetuning
- flow
- fpn
- gan
- gan inversion
- generalization
- generative model
- graph
- hallucination
- hypernetwork
- hyperparameter
- identifiability
- image editing
- image generation
- img2img
- implicit model
- implicit representation
- instance segmentation
- interpolation
- knowledge base
- language generation
- language model
- layout
- lightweight
- line
- lm
- local attention
- loss
- loss surface
- matting
- memory
- meta learning
- metric learning
- mixup
- mlm
- multimodal
- multitask
- nas
- nerf
- neural computer
- neural ode
- neural rendering
- nlp
- nmt
- noise
- non autoregressive
- norm free
- normalization
- object detection
- ocr
- optimization
- optimizer
- oriented object detection
- out of distribution
- panoptic segmentation
- perceptual loss
- pooling
- pose
- positional encoding
- pretraining
- probabilistic model
- pruning
- qa
- reasoning
- regularization
- reinforcement learning
- rendering
- representation
- resampling
- restoration
- review
- robustness
- saliency
- salient object detection
- scale
- score
- self supervised
- self supervised discovery
- semantic factor
- semantic segmentation
- semi supervised learning
- sgld
- single image
- speech
- structure learning
- style transfer
- stylegan
- super resolution
- text generation
- topic model
- topology
- tracking
- training
- transducer
- transfer
- transformer
- tropical geometry
- tts
- unsupervised img2img
- unsupervised nmt
- vae
- video
- video transformer
- vision
- vision language
- vision transformer
- visual grounding
- vit
- vocoder
- weak supervision
- uncategorized
- 201120 An Effective Anti-Aliasing Approach for Residual Networks
- 201128 Truly shift-invariant convolutional neural networks
- 200220 Imputer #non-autoregressive #ctc
- 200510 Listen Attentively, and Spell Once #non-autoregressive
- 200516 Large scale weakly and semi-supervised learning for low-resource video ASR #weak_supervision #semi_supervised_learning
- 200516 Reducing Spelling Inconsistencies in Code-Switching ASR using #ctc
- 200516 Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition #non-autoregressive
- 200518 Attention-based Transducer for Online Speech Recognition #transducer
- 200518 Iterative Pseudo-Labeling for Speech Recognition
- 200519 Distilling Knowledge from Ensembles of Acoustic Models for Joint CTC-Attention End-to-End Speech Recognition #ctc
- 200519 Improved Noisy Student Training for Automatic Speech Recognition #semi_supervised_learning
- 200729 Developing RNN-T Models Surpassing High-Performance Hybrid Models with #rnn_t
- 201021 FastEmit #transducer #decoding
- 201027 CASS-NAT #non-autoregressive
- 201125 Streaming end-to-end multi-talker speech recognition #transducer
- 210524 Unsupervised Speech Recognition #unsupervised_training
- 200122 Object Contextual Representations #semantic_segmentation
- 200129 Empirical Attention
- 200130 Axial Attention #generative_model
- 200130 Criss-Cross Attention #semantic_segmentation
- 200212 Capsules with Inverted Dot-Product Attention Routing #capsule
- 200219 Tree-structured Attention with Hierarchical Accumulation #parse
- 200226 Sparse Sinkhorn Attention #sparse_attention
- 200317 Axial-DeepLab #panoptic_segmentation
- 200404 Neural Architecture Search for Lightweight Non-Local Networks
- 200421 Attention is Not Only a Weight #bert
- 200423 Self-Attention Attribution #bert
- 200428 Exploring Self-attention for Image Recognition
- 200510 CTC-synchronous Training for Monotonic Attention Model #asr #ctc
- 200516 Streaming Transformer-based Acoustic Models Using Self-attention with Augmented Memory #asr #memory
- 200519 Normalized Attention Without Probability Cage
- 200519 Staying True to Your Word
- 200626 Object-Centric Learning with Slot Attention
- 201119 On the Dynamics of Training Attention Models #training
- 210223 Linear Transformers Are Secretly Fast Weight Memory Systems #linear_attention #efficient_attention
- 210225 LazyFormer #bert
- 210517 Pay Attention to MLPs #mlp
- 210524 Self-Attention Networks Can Process Bounded Hierarchical Languages #nlp
- 200122 FixMatch #semi_supervised_learning #manifold #mixup
- 200220 Affinity and Diversity
- 200621 AdvAug #mixup #nlp #adversarial_training
- 200710 Meta-Learning Requires Meta-Augmentation #meta_learning
- 201117 Sequence-Level Mixed Sample Data Augmentation #nlp
- 201125 Can Temporal Information Help with Contrastive Self-Supervised Learning #video #self_supervised
- 201213 Simple Copy-Paste is a Strong Data Augmentation Method for Instance #instance_segmentation
- 201214 Improving Panoptic Segmentation at All Scales #panoptic_segmentation
- 210318 AlignMix #mixup
- 210318 TrivialAugment
- 210429 Ensembling with Deep Generative Views #ensemble #gan_inversion
- 200129 Semi-Autoregressive Training
- 201027 Scaling Laws for Autoregressive Generative Modeling #scale
- 190724 MixNet #convolution
- 200123 Antialiasing #invariance
- 200128 Attentive Normalization
- 200128 IBN-Net
- 200128 Selective Kernel
- 200128 SpineNet
- 200128 Squeeze-Excitation
- 200128 Switchable Normalization
- 200128 Switchable Whitening
- 200129 Assembled Techniques #regularization
- 200129 DenseNet
- 200129 Dual Path Networks
- 200129 HarDNet
- 200129 PyramidNet
- 200129 SelecSLS
- 200129 ShuffleNet V2 #efficiency
- 200129 VoVNet
- 200130 FishNet
- 200130 HRNet
- 200130 MixConv #convolution
- 200330 Designing Network Design Spaces #hypernetwork
- 200330 TResNet #antialiasing
- 200419 ResNeSt
- 200630 Deep Isometric Learning for Visual Recognition #normalization #resnet #cnn #norm_free
- 200712 PSConv #cnn #multiscale
- 201015 HS-ResNet #multiscale
- 201221 FcaNet #channel_attention
- 210226 Transformer in Transformer #vision_transformer
- 210310 Involution #convolution #attention
- 210312 Revisiting ResNets #resnet
- 210317 Learning to Resize Images for Computer Vision Tasks #resizing
- 210331 EfficientNetV2
- 210408 SI-Score #robustness #vision_transformer
- 210505 RepMLP #mlp
- 210506 Do You Even Need Attention #mlp
- 210510 ResMLP #mlp
- 200207 Bayes Posterior
- 200210 Liberty or Depth #mean_field
- 200220 Neural Bayes #representation #clustering
- 200514 Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors #ensemble #variational_inference
- 200305 What the [MASK]
- 200427 DeeBERT #lightweight
- 200518 Audio ALBERT #audio #representation
- 200601 Amnesic Probing
- 200608 On the Stability of Fine-tuning BERT #finetuning
- 200610 Revisiting Few-sample BERT Fine-tuning #finetuning
- 200519 Identifying Statistical Bias in Dataset Replication
- 201202 Learning from others' mistakes #product_of_experts
- 200221 Calibrating Deep Neural Networks using Focal Loss #loss
- 200223 Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks #bayesian
- 200620 Regression Prior Networks
- 200630 PLATO-2 #text_generation #chatbot
- 200213 Training Large Neural Networks with Constant Memory using a New Execution Algorithm
- 201204 Nimble
- 200213 A Simple Framework for Contrastive Learning of Visual Representations #augmentation
- 200309 Improved Baselines with Momentum Contrastive Learning
- 200423 Supervised Contrastive Learning #metric_learning
- 200511 Prototypical Contrastive Learning of Unsupervised Representations
- 200520 What Makes for Good Views for Contrastive Learning
- 200613 Bootstrap your own latent
- 200630 Debiased Contrastive Learning
- 200730 Contrastive Learning for Unpaired Image-to-Image Translation #img2img
- 200803 LoCo
- 201020 BYOL works even without batch statistics
- 201109 Towards Domain-Agnostic Contrastive Learning #mixup #multimodal
- 201116 AdCo #adversarial_training
- 201117 Dense Contrastive Learning for Self-Supervised Visual Pre-Training
- 201119 Heterogeneous Contrastive Learning
- 201119 Propagate Yourself
- 201121 Run Away From your Teacher
- 201123 Boosting Contrastive Self-Supervised Learning with False Negative
- 201126 Beyond Single Instance Multi-view Unsupervised Representation Learning #self_supervised #mixup
- 201126 How Well Do Self-Supervised Models Transfer #self_supervised #transfer
- 201127 Self-EMD
- 201201 Towards Good Practices in Self-supervised Representation Learning #self_supervised
- 201204 Seed the Views #mixup
- 201212 Contrastive Learning for Label-Efficient Semantic Segmentation #semantic_segmentation
- 201221 Online Bag-of-Visual-Words Generation for Unsupervised Representation #self_supervised #discrete_vae
- 201226 Spatial Contrastive Learning for Few-Shot Classification #few_shot #attention
- 210304 Barlow Twins #self_supervised #backbone
- 210325 Rethinking Self-Supervised Learning #training
- 210405 An Empirical Study of Training Self-Supervised Vision Transformers #vision_transformer
- 210426 Multimodal Contrastive Training for Visual Representation Learning #multimodal
- 210429 A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning #video
- 210429 Emerging Properties in Self-Supervised Vision Transformers #saliency #vision_transformer #representation
- 210429 With a Little Help from My Friends #knn
- 210510 Self-Supervised Learning with Swin Transformers #vision_transformer
- 210511 VICReg
- 210517 Divide and Contrast #self_supervised #dataset #distillation
- 210601 Exploring the Diversity and Invariance in Yourself for Visual Pre-Training Task
- 200509 Building a Manga Dataset
- 201130 Image Quality Assessment for Perceptual Image Restoration #score
- 201201 Weakly-Supervised Arbitrary-Shaped Text Detection with #ocr #weak_supervision
- 210601 Comparing Test Sets with Item Response Theory
- 200619 Denoising Diffusion Probabilistic Models
- 201214 Learning Energy-Based Models by Diffusion Recovery Likelihood #energy_based_model
- 210506 DiffSinger #singing_voice_synthesis
- 210511 Diffusion Models Beat GANs on Image Synthesis
- 210528 Gotta Go Fast When Generating Data with Score-Based Models
- 210531 On Fast Sampling of Diffusion Probabilistic Models
- 200130 ID-GAN #gan
- 200130 MixNMatch #conditional_generative_model
- 200515 Face Identity Disentanglement via Latent Space Mapping
- 200129 Learning by Cheating
- 200209 Understanding and Improving Knowledge Distillation
- 200210 Subclass Distillation
- 200219 Knapsack Pruning with Inner Distillation #pruning #lightweight
- 200221 Residual Knowledge Distillation
- 200309 Knowledge distillation via adaptive instance normalization #normalization
- 200405 FastBERT #bert #lightweight
- 200408 DynaBERT #bert #pruning
- 200408 Improving BERT with Self-Supervised Attention #bert #self_supervised
- 200412 XtremeDistil #bert #lightweight
- 200521 Why distillation helps #calibration
- 200629 An EM Approach to Non-autoregressive Conditional Sequence Generation #non-autoregressive
- 200701 Go Wide, Then Narrow #lightweight
- 200702 Interactive Knowledge Distillation
- 200410 Longformer
- 200412 ProFormer
- 200605 Masked Language Modeling for Proteins via Linearly Scalable Long-Context
- 200608 Linformer
- 210324 Finetuning Pretrained Transformers into RNNs
- 210505 Beyond Self-attention
- 210510 Poolingformer
- 210603 Luna
- 200424 All Word Embeddings from One Embedding
- 200717 A Unifying Perspective on Neighbor Embeddings along the
- 200605 End-to-End Adversarial Text-to-Speech #tts
- 200608 FastSpeech 2 #tts
- 201106 Wave-Tacotron #tts
- 200504 How to Train Your Energy-Based Model for Regression
- 201124 Energy-Based Models for Continual Learning #continual_learning
- 200214 AutoLR #pruning
- 200426 Masking as an Efficient Alternative to Finetuning for Pretrained
- 200709 Sample-based Regularization #transfer
- 200220 Regularized Autoencoders via Relaxed Injective Probability Flow
- 200227 Woodbury Transformations for Deep Generative Flows
- 200122 CARAFE #resampling
- 200129 Mixture FPN
- 200506 Scale-Equalizing Pyramid Convolution for Object Detection
- 201201 Dynamic Feature Pyramid Networks for Object Detection
- 201202 Dual Refinement Feature Pyramid Networks for Object Detection
- 201202 Parallel Residual Bi-Fusion Feature Pyramid Network for Accurate
- 201225 Implicit Feature Pyramid Network for Object Detection #equilibrium_model #implicit_model
- 170629 Do GANs actually learn the distribution
- 191022 MelGAN #tts
- 200129 Adversarial Lipschitz Regularization
- 200129 GAN generalization metric
- 200129 OneGAN
- 200130 AttentionGAN #attention #img2img
- 200130 Evaluation metrics of GAN #metric #evaluation #generative_model
- 200130 Local GAN #attention
- 200130 Noise Robust GAN #robustness
- 200130 Small-GAN
- 200130 Smoothness and Stability in GANs
- 200206 Unbalanced GANs #vae
- 200210 Unsupervised Discovery of Interpretable Directions in the GAN Latent #semantic_factor
- 200211 Improved Consistency Regularization for GANs #augmentation #consistency_regularization
- 200211 Smoothness and Stability in GANs #regularization
- 200212 Image-to-Image Translation with Text Guidance #multimodal #multimodal_generation #img2img
- 200212 Real or Not Real, that is the Question
- 200214 Top-k Training of GANs #regularization
- 200220 The Benefits of Pairwise Discriminators for Adversarial Training #regularization
- 200223 GANHopper #img2img
- 200224 When Relation Networks meet GANs #regularization
- 200225 Freeze the Discriminator #finetuning #transfer
- 200226 On Leveraging Pretrained GANs for Generation with Limited Data #finetuning #transfer
- 200227 Topology Distance #topology #score
- 200228 A U-Net Based Discriminator for Generative Adversarial Networks
- 200304 Creating High Resolution Images with a Latent Adversarial Generator #generative_model #super_resolution
- 200308 Perceptual Image Super-Resolution with Progressive Adversarial Network #super_resolution
- 200312 Your GAN is Secretly an Energy-based Model and You Should use Discriminator Driven Latent Sampling #energy_based_model #sampling
- 200317 Blur, Noise, and Compression Robust Generative Adversarial Networks #noise
- 200318 OpenGAN #metric_learning
- 200325 Improved Techniques for Training Single-Image GANs #single_image
- 200326 Image Generation Via Minimizing Fréchet Distance in Discriminator Feature Space
- 200402 Controllable Orthogonalization in Training DNNs #regularization
- 200404 Feature Quantization Improves GAN Training #discrete_vae
- 200405 Discriminator Contrastive Divergence
- 200407 Inclusive GAN
- 200408 Attentive Normalization for Conditional Image Generation #attention
- 200504 Transforming and Projecting Images into Class-conditional Generative #generative_model
- 200518 Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization #audio_generation
- 200519 CIAGAN
- 200519 Regularization Methods for Generative Adversarial Networks #review #regularization
- 200604 Image Augmentations for GAN Training #augmentation
- 200611 Training Generative Adversarial Networks with Limited Data #augmentation
- 200618 Differentiable Augmentation for Data-Efficient GAN Training #augmentation
- 200618 Diverse Image Generation via Self-Conditioned GANs #generative_model
- 200630 PriorGAN
- 200708 InfoMax-GAN #regularization
- 200713 Closed-Form Factorization of Latent Semantics in GANs #semantic_factor
- 200729 Instance Selection for GANs
- 200729 VocGAN #vocoder
- 200730 Rewriting a Deep Generative Model
- 200804 Open-Edit #image_editing
- 200807 Improving the Speed and Quality of GAN by Adversarial Training #robustness
- 201028 Training Generative Adversarial Networks by Solving Ordinary #neural_ode
- 201109 Learning Semantic-aware Normalization for Generative Adversarial Networks #normalization
- 201109 Towards a Better Global Loss Landscape of GANs #training
- 201118 Style Intervention #semantic_factor
- 201124 Adversarial Generation of Continuous Images #implicit_representation
- 201125 How to train your conditional GAN #img2img #generative_model
- 201125 Omni-GAN #generative_model
- 201127 Image Generators with Conditionally-Independent Pixel Synthesis #implicit_representation
- 201201 Refining Deep Generative Models via Discriminator Gradient Flow #sampling
- 201201 pi-GAN #implicit_representation
- 201203 Self-labeled Conditional GANs #unsupervised_training
- 201204 A Note on Data Biases in Generative Models #bias #generative_model
- 201208 You Only Need Adversarial Supervision for Semantic Image Synthesis #img2img
- 210227 Ultra-Data-Efficient GAN Training #augmentation #few_shot
- 210317 Training GANs with Stronger Augmentations via Contrastive Discriminator #contrastive_learning #augmentation
- 210318 Drop the GAN #single_image #generative_model #patch
- 210330 Dual Contrastive Loss and Attention for GANs #contrastive_learning
- 210401 Partition-Guided GANs
- 210407 Regularizing Generative Adversarial Networks under Limited Data #regularization
- 210408 InfinityGAN
- 210413 DatasetGAN #few_shot
- 210413 Few-shot Image Generation via Cross-domain Correspondence #img2img #generative_model #few_shot
- 210414 Aligning Latent and Image Spaces to Connect the Unconnectable
- 210415 GANcraft #nerf
- 210422 On Buggy Resizing Libraries and Surprising Subtleties in FID Calculation #antialiasing
- 210426 EigenGAN #semantic_factor
- 200331 In-Domain GAN Inversion for Real Image Editing
- 200703 Collaborative Learning for Faster StyleGAN Embedding
- 200130 Fantastic Generalization Measures
- 200225 Rethinking Bias-Variance Trade-off for Generalization of Neural Networks
- 190325 Implicit Generative and Generalization in Energy-Based Models #energy_based_model
- 200129 Controlling Generative Model
- 200129 Deep Automodulator
- 200129 Frechet Joint Distance
- 200129 Spot CNN-generated images
- 200130 BIVA
- 200130 Glow #flow
- 200130 IGEBM #energy_based_model
- 200130 Neural Spline Flows #flow
- 200130 VQ-VAE-2 #autoregressive_model
- 200217 Augmented Normalizing Flows #flow
- 200313 Semantic Pyramid for Image Generation #perceptual_loss #image_editing
- 200616 Improved Techniques for Training Score-Based Generative Models #ncsn
- 201117 DeepNAG
- 201126 Score-Based Generative Modeling through Stochastic Differential #ddpm
- 201202 Improved Contrastive Divergence Training of Energy Based Models #energy_based_model
- 201204 Few-shot Image Generation with Elastic Weight Consolidation #few_shot #continual_learning
- 201209 Positional Encoding as Spatial Inductive Bias in GANs #positional_encoding
- 201224 Soft-IntroVAE #vae
- 210223 Zero-Shot Text-to-Image Generation #discrete_vae #autoregressive_model #multimodal
- 210302 Fixing Data Augmentation to Improve Adversarial Robustness #ddpm #augmentation
- 210305 Fixing Data Augmentation to Improve Adversarial Robustness 2 #robustness #augmentation #ddpm
- 210318 Few-shot Semantic Image Synthesis Using StyleGAN Prior #stylegan #few_shot
- 200722 WeightNet #channel_attention
- 200515 Semantic Photo Manipulation with a Generative Image Prior
- 201123 HistoGAN
- 210318 Using latent space regression to analyze and leverage compositionality
- 200130 FUNIT
- 200305 SketchyCOCO
- 200315 GMM-UNIT #multimodal_generation
- 200319 High-Resolution Daytime Translation Without Domain Labels
- 200330 Semi-supervised Learning for Few-shot Image-to-Image Translation #semi_supervised_learning #few_shot
- 200406 Rethinking Spatially-Adaptive Normalization #lightweight
- 200409 TuiGAN #few_shot #single_image
- 200419 TriGAN #domain_adaptation
- 200702 Deep Single Image Manipulation #single_image #image_editing
- 200709 Improving Style-Content Disentanglement in Image-to-Image Translation #disentangle
- 200714 COCO-FUNIT
- 200715 Transformation Consistency Regularization- A Semi-Supervised Paradigm #augmentation #semi_supervised_learning
- 200723 TSIT
- 200724 The Surprising Effectiveness of Linear Unsupervised Image-to-Image
- 201203 CoCosNet v2 #patch #pose
- 201205 Spatially-Adaptive Pixelwise Networks for Fast Image Translation #implicit_representation
- 210506 ACORN #positional_encoding
- 200129 BlendMask
- 200129 COCO 2018 Instance Segmentation #challenge
- 200129 Deep Snake
- 200130 PointRend
- 200311 Conditional Convolutions for Instance Segmentation
- 200313 PointINS #dynamic_conv
- 200722 Deep Variational Instance Segmentation
- 200730 LevelSet R-CNN
- 201119 DCT-Mask
- 201119 Unifying Instance and Panoptic Segmentation with Dynamic Rank-1 #panoptic_segmentation #dynamic_conv
- 201126 The Devil is in the Boundary
- 201129 End-to-End Video Instance Segmentation with Transformers #end2end #detr #video
- 201203 BoxInst #dataset #weak_supervision
- 210503 ISTR #end2end
- 210505 QueryInst #end2end
- 200424 Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order #mlm
- 200712 Do You Have the Right Scissors
- 200729 Mirostat
- 200128 Scaling Laws for LM
- 200205 K-Adapter #multitask #adapter
- 200206 Consistency of a Recurrent Language Model With Respect to Incomplete #decoding #hallucination #language_generation
- 200222 Training Question Answering Models From Synthetic Data #qa #bert
- 200225 MiniLM #distillation #lightweight
- 200406 Sparse Text Generation #language_generation #sampling
- 200427 Recall and Learn #finetuning #continual_learning
- 200505 Stolen Probability
- 200516 MicroNet for Efficient Language Modeling #lightweight
- 200518 Contextual Embeddings
- 201015 Fine-Tuning Pre-trained Language Model with Weak Supervision #transfer #weak_supervision
- 201023 Rethinking embedding coupling in pre-trained language models #regularization
- 201201 How Can We Know When Language Models Know #qa #calibration
- 201228 Universal Sentence Representation Learning with Conditional Masked #sentence_embedding #mlm
- 210216 Non-Autoregressive Text Generation with Pre-trained Language Models #non-autoregressive #text_generation
- 210318 GPT Understands, Too #finetuning #prompt
- 210407 Revisiting Simple Neural Probabilistic Language Models
- 210420 Carbon Emissions and Large Neural Network Training #nlp
- 200624 Neural Architecture Design for GPU-Efficient Networks
- 201124 MicroNet
- 210507 Pareto-Optimal Quantized ResNet Is Mostly 4-bit #quantization
- 210524 StructuralLM #layout
- 210528 ByT5
- 200221 Learning to Continually Learn #continual_learning
- 200312 Online Fast Adaptation and Knowledge Accumulation
- 200401 Editable Neural Networks
- 200402 Tracking by Instance Detection #tracking
- 200706 Meta-Learning Symmetries by Reparameterization #group_equivariance
- 210502 Larger-Scale Transformers for Multilingual Masked Language Modeling #multilingual #scale
- 200324 BigNAS
- 200326 Are Labels Necessary for Neural Architecture Search #unsupervised_training
- 200406 Network Adjustment
- 200412 FBNetV2
- 200428 Angle-based Search Space Shrinking for Neural Architecture Search
- 200506 Local Search is State of the Art for Neural Architecture Search
- 200507 Noisy Differentiable Architecture Search
- 200512 Neural Architecture Transfer #transfer
- 200602 FBNetV3 #hyperparameter #training #swa
- 200720 NSGANetV2
- 201014 NeRF++
- 201125 Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes
- 201127 D-NeRF
- 201203 Learned Initializations for Optimizing Coordinate-Based Neural #implicit_representation
- 201203 pixelNeRF
- 201215 Object-Centric Neural Scene Rendering
- 210225 IBRNet
- 210318 FastNeRF
- 210318 GNeRF
- 210318 MVSNeRF
- 210318 NeMI
- 210324 Mip-NeRF
- 210325 KiloNeRF
- 210325 PlenOctrees for Real-time Rendering of Neural Radiance Fields
- 200207 How to train your neural ODE
- 200520 Neural Controlled Differential Equations
- 200708 Learning Differential Equations that are Easy to Solve
- 200226 Learning to Shadow Hand-drawn Sketches
- 200427 Neural Hair Rendering
- 200506 CONFIG
- 201116 Stylized Neural Painting
- 201119 Creative Sketch Generation
- 201130 Animating Pictures with Eulerian Motion Fields #single_image
- 210319 Paint by Word
- 210512 Enhancing Photorealism Enhancement
- 200129 Meena #dialog
- 200518 (Re)construing Meaning in NLP
- 200715 Towards Debiasing Sentence Representations #bias
- 201117 Neural Semi-supervised Learning for Text Classification Under #self_supervised
- 200207 A Multilingual View of Unsupervised Machine Translation #multilingual
- 200427 Lexically Constrained Neural Machine Translation with Levenshtein Transformer
- 200710 Learn to Use Future Information in Simultaneous Translation #simultaneous_translation
- 201224 Why Neural Machine Translation Prefers Empty Outputs #hallucination
- 201223 Noisy Labels Can Induce Good Representations #representation
- 200403 Aligned Cross Entropy for Non-Autoregressive Machine Translation
- 200415 Non-Autoregressive Machine Translation with Latent Alignments #nmt #ctc
- 200422 A Study of Non-autoregressive Model for Sequence Generation
- 201022 Parallel Tacotron #vae
- 201025 Improved Mask-CTC for Non-Autoregressive End-to-End ASR #ctc
- 201125 FBWave #vocoder #lightweight
- 201207 EfficientTTS #tts
- 200310 ReZero is All You Need #initialization
- 200122 Group Norm, Weight Standardization
- 200122 Moving Average Batch Normalization
- 200122 StyleGAN 2 #gan
- 200130 Rethinking Normalization
- 200130 Weight Standardization #weight
- 200224 Batch Normalization Biases Residual Blocks Towards the Identity Function #optimization #norm_free #initialization
- 200306 TaskNorm #meta_learning
- 200406 Evolving Normalization-Activation Layers #nas #activation
- 200427 A Batch Normalized Inference Network Keeps the KL Vanishing Away
- 201128 Batch Normalization with Enhanced Linear Transformation
- 191118 Anchor-Free
- 191118 CenterMask #instance_segmentation #backbone #1stage
- 191121 EfficientDet
- 200103 BlendMask #instance_segmentation #1stage
- 200122 SABL
- 200129 AP Loss #loss
- 200129 Backbone Reallocation for Detection #backbone #nas
- 200129 Dense RepPoints
- 200129 DetNAS #nas #backbone
- 200129 IoU-aware single-stage detector #1stage
- 200130 ATSS #anchor #retinanet #fcos
- 200130 AutoAugment #augmentation #search
- 200130 EfficientDet #fpn
- 200130 Keypoint Triplet #keypoint
- 200130 Learning from Noisy Anchors
- 200130 Multiple Anchor Learning #anchor
- 200130 Objects as Points #keypoint
- 200130 Soft Anchor-Point #anchor
- 200211 Object Detection as a Positive-Unlabeled Problem #positive_unlabeled #dataset
- 200212 Solving Missing-Annotation Object Detection with Background #dataset #noise
- 200218 Universal-RCNN #multi_dataset #graph
- 200316 Frustratingly Simple Few-Shot Object Detection #few_shot
- 200317 Revisiting the Sibling Head in Object Detector
- 200319 Revisiting the Sibling Head in Object Detector #review
- 200320 CentripetalNet #keypoint
- 200413 Dynamic R-CNN
- 200423 YOLOv4
- 200511 Scope Head for Accurate Localization in Object Detection
- 200526 End-to-End Object Detection with Transformers #end2end #matching
- 200603 DetectoRS
- 200611 Rethinking Pre-training and Self-training #semi_supervised_learning #transfer
- 200706 LabelEnc #distillation
- 200707 AutoAssign #anchor_free
- 200714 AQD #quantization
- 200715 Probabilistic Anchor Assignment with IoU Prediction for Object Detection #anchor #1stage
- 200716 RepPoints V2 #1stage #anchor_free
- 200723 PP-YOLO #tuning
- 200723 The Devil is in Classification #longtail
- 200727 Corner Proposal Network for Anchor-free, Two-stage Object Detection #anchor_free #2stage
- 201116 Scaled-YOLOv4
- 201117 UP-DETR #detr #end2end #pretraining
- 201118 End-to-End Object Detection with Adaptive Clustering Transformer #detr #end2end #efficiency
- 201121 Rethinking Transformer-based Set Prediction for Object Detection #detr #end2end #efficiency
- 201124 Sparse R-CNN
- 201128 Class-agnostic Object Detection
- 201207 End-to-End Object Detection with Fully Convolutional Network #end2end
- 201223 SWA Object Detection #swa
- 201227 Towards A Category-extended Object Detector without Relabeling or #continual_learning
- 210225 Simple multi-dataset detection #multi_dataset
- 210316 You Only Look One-level Feature
- 210325 USB #dataset
- 210417 TransVG #visual_grounding
- 210420 PP-YOLOv2 #yolo
- 210426 MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding #detr #visual_grounding
- 210601 You Only Look at One Sequence #vit
- 200221 The Break-Even Point on Optimization Trajectories of Deep Neural Networks #loss #training
- 200224 The Early Phase of Neural Network Training
- 200227 Using a thousand optimization tasks to learn hyperparameter search strategies #optimizer #hyperparameter
- 200228 A Self-Tuning Actor-Critic Algorithm #reinforcement_learning #hyperparameter #meta_learning
- 200316 Weak and Strong Gradient Directions
- 200403 Gradient Centralization #training
- 200508 An Investigation of Why Overparameterization Exacerbates Spurious #training
- 200519 One Size Fits All
- 200130 LAMB #large_batch
- 200509 Generalizing Outside the Training Set
- 200519 Bridging the Gap Between Training and Inference for Spatio-Temporal Forecasting
- 200129 Bridging the train-infer gap in Panoptic Segmentation
- 200130 Panoptic-DeepLab
- 200218 Towards Bounding-Box Free Panoptic Segmentation #box_free
- 200404 Pixel Consensus Voting for Panoptic Segmentation
- 200421 Panoptic-based Image Synthesis #neural_rendering
- 201123 Scaling Wide Residual Networks for Panoptic Segmentation #scale
- 201201 Fully Convolutional Networks for Panoptic Segmentation #dynamic_conv
- 201201 MaX-DeepLab #detr #end2end
- 201202 Single-shot Path Integrated Panoptic Segmentation #dynamic_conv
- 200206 Image Fine-grained Inpainting #inpainting
- 200330 Exploiting Deep Generative Prior for Versatile Image Restoration and #gan_inversion
- 200515 Enhancing Perceptual Loss with Adversarial Feature Matching for Super-Resolution
- 200626 A Loss Function for Generative Neural Networks Based on Watson's
- 201223 Focal Frequency Loss for Image Reconstruction and Synthesis #loss
- 200729 Unselfie #inpainting
- 200628 Rethinking Positional Encoding in Language Pre-training
- 210408 Modulated Periodic Activations for Generalizable Local Functional #periodic_activation #implicit_representation
- 190620 XLNet #language_model
- 190729 RoBERTa #language_model
- 200128 mBART #machine_translation #nlp
- 200129 ImageBERT #multimodal
- 200129 LM Pretraining #nlp
- 200129 oLMpics #language_model #nlp
- 200130 RoBERTa #language_model #nlp #transformer
- 200130 T5 #nlp #transformer #seq2seq
- 200130 ViLBERT #multimodal
- 200210 Pre-training Tasks for Embedding-based Large-scale Retrieval #retrieval
- 200217 Incorporating BERT into Neural Machine Translation #language_model #bert #nmt
- 200219 CodeBERT #bert
- 200228 UniLMv2 #language_model
- 200317 Calibration of Pre-trained Transformers #calibration
- 200405 Unsupervised Domain Clusters in Pretrained Language Models #domain
- 200412 Pre-training Text Representations as Meta Learning #meta_learning #finetuning
- 200413 Pretrained Transformers Improve Out-of-Distribution Robustness #out_of_distribution
- 200419 Are we pretraining it right #multimodal
- 200420 Adversarial Training for Large Neural Language Models #adversarial_training #language_model #finetuning
- 200420 MPNet #language_model
- 200423 Don't Stop Pretraining #domain
- 200427 LightPAFF #distillation #finetuning
- 200520 Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models #contrastive_learning #sentence_embedding
- 200610 MC-BERT
- 200615 To Pretrain or Not to Pretrain #nlp #finetuning
- 200626 Pre-training via Paraphrasing #retrieval
- 200703 Language-agnostic BERT Sentence Embedding #embedding #multilingual
- 200713 An Empirical Study on Robustness to Spurious Correlations using #nlp #multitask
- 200715 InfoXLM #nlp #cross_lingual
- 200804 Taking Notes on the Fly Helps BERT Pre-training #nlp
- 201020 Pushing the Limits of Semi-Supervised Learning for Automatic Speech #semi_supervised_learning #asr
- 201021 Self-training and Pre-training are Complementary for Speech Recognition #self_supervised #asr
- 201022 mT5 #language_model #multilingual
- 201109 When Do You Need Billions of Words of Pretraining Data #language_model
- 201127 Progressively Stacking 2.0 #efficiency
- 201201 Pre-Trained Image Processing Transformer #contrastive_learning #vision_transformer #restoration
- 201201 StructFormer #parse #attention #mlm
- 201227 Syntax-Enhanced Pre-trained Model #language_model #syntax
- 210225 SparseBERT #attention #sparse_attention #bert
- 210318 All NLP Tasks Are Generation Tasks #language_model
- 210324 Can Vision Transformers Learn without Natural Images #vision_transformer
- 210402 Robust wav2vec 2.0 #asr
- 210407 Pushing the Limits of Non-Autoregressive Speech Recognition #non-autoregressive #asr #ctc
- 210413 Masked Language Modeling and the Distributional Hypothesis #language_model #mlm
- 210417 mT6 #language_model
- 210418 Data-Efficient Language-Supervised Zero-Shot Learning with #multimodal
- 210422 ImageNet-21K Pretraining for the Masses #backbone
- 210510 Are Pre-trained Convolutions Better than Pre-trained Transformers #nlp #convolution #transformer
- 200130 Rethinking Pruning
- 200218 Picking Winning Tickets Before Training by Preserving Gradient Flow #lottery_ticket
- 200224 HRank #rank
- 200305 Comparing Rewinding and Fine-tuning in Neural Network Pruning
- 200424 Convolution-Weight-Distribution Assumption
- 200514 Bayesian Bits #quantization #variational_inference
- 200515 Movement Pruning
- 200518 Joint Multi-Dimension Pruning
- 200706 Lossless CNN Channel Pruning via Decoupling Remembering and Forgetting
- 200710 To Filter Prune, or to Layer Prune, That Is The Question
- 200130 DropAttention #dropout
- 200219 Revisiting Training Strategies and Generalization Performance in Deep #metric_learning
- 200225 On Feature Normalization and Data Augmentation #normalization #mixup
- 200228 The Implicit and Explicit Regularization Effects of Dropout #dropout
- 200331 Regularizing Class-wise Predictions via Self-knowledge Distillation #distillation #consistency_regularization
- 200409 Orthogonal Over-Parameterized Training
- 200424 Dropout as an Implicit Gating Mechanism For Continual Learning
- 200427 Scheduled DropHead
- 200506 RNN-T Models Fail to Generalize to Out-of-Domain Audio #transducer #out_of_distribution #domain #asr
- 200513 Implicit Regularization in Deep Learning May Not Be Explainable by Norms #training #optimization
- 200707 RIFLE #finetuning
- 200707 Remix #imbalanced
- 200721 Improving compute efficacy frontiers with SliceOut #efficient_training
- 201122 Stable Weight Decay Regularization
- 210603 When Vision Transformers Outperform ResNets without Pretraining or Strong Data Augmentations #vit
- 191120 Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
- 200130 Mastering Atari, Go, Chess, Shogi
- 200626 Critic Regularized Regression
- 200402 Learning to See Through Obstructions
- 200404 Deblurring by Realistic Blurring
- 200406 Self-Supervised Scene De-occlusion
- 200420 Bringing Old Photos Back to Life #vae
- 201123 Cross-Camera Convolutional Color Constancy
- 201123 Dissecting Image Crops
- 191210 Thoughts on recent papers
- 200130 Filter Response Normalization
- 200227 A Primer in BERTology #bert
- 200306 What is the State of Neural Network Pruning #pruning
- 200311 Improved Baselines with Momentum Contrastive Learning #contrastive_learning
- 200318 A Metric Learning Reality Check #metric_learning
- 200323 Thoughts on recent papers
- 200324 A Systematic Evaluation
- 200325 Rethinking Few-Shot Image Classification #meta_learning
- 200326 Thoughts on recent papers
- 200403 Thoughts on recent papers
- 200408 State of the Art on Neural Rendering #neural_rendering
- 200409 EvoNorm
- 200411 Thoughts on recent papers
- 200428 Showing Your Work Doesn't Always Work
- 200619 Augmentation for GANs
- 200627 Denoising Diffusion Probabilistic Models Implementation
- 200708 Thoughts on recent papers
- 200717 Semantic factor of GANs
- 200717 Thoughts on recent papers
- 200725 Neighbor Embedding
- 200726 Thoughts on recent papers
- 200802 Thoughts on recent papers
- 200821 Virtual Try On
- 201016 Representation Learning via Invariant Causal Mechanisms
- 201021 BYOL works even without batch statistics
- 201108 Long Range Arena #attention #efficient_attention
- 201112 Learning Semantic-aware Normalization for Generative Adversarial Networks
- 201112 When Do You Need Billions of Words of Pretraining Data
- 201118 Thoughts on recent papers
- 201120 Thoughts on recent papers
- 201125 Thoughts on recent papers
- 201126 Thoughts on recent papers 1
- 201126 Thoughts on recent papers 2
- 201204 Thoughts on recent papers
- 210121 Thoughts on recent papers
- 210227 Thoughts on recent papers
- 210305 Thoughts on recent papers
- 210319 Thoughts on recent papers
- 210323 Thoughts on recent papers
- 210324 A Broad Study on the Transferability of Visual Representations with Contrastive Learning #contrastive_learning
- 210325 Contrasting Contrastive Self-Supervised Representation Learning Models #contrastive_learning
- 210326 Thoughts on recent papers
- 210403 Thoughts on recent papers
- 210412 Thoughts on recent papers
- 210424 Thoughts on recent papers
- 210429 Thoughts on recent papers
- 210430 Thoughts on recent papers 1
- 210430 Thoughts on recent papers 2
- 210505 Thoughts on recent papers
- 210508 Thoughts on recent papers
- 210512 When Does Contrastive Visual Representation Learning Work #contrastive_learning #self_supervised #transfer
- 200211 Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial #adversarial_training
- 200304 A Closer Look at Accuracy vs. Robustness #adversarial_training
- 200810 Informative Dropout for Robust Representation Learning
- 210521 Intriguing Properties of Vision Transformers #vit
- 200712 Learning to Learn Parameterized Classification Networks for Scalable #hypernetwork
- 201130 Towards Better Accuracy-efficiency Trade-offs
- 200213 Automatically Discovering and Learning New Visual Categories with Ranking Statistics #weak_supervision
- 200218 MAST #tracking
- 200224 Self-Adaptive Training #noise #dataset
- 200722 CrossTransformers #few_shot
- 201015 Representation Learning via Invariant Causal Mechanisms #causality
- 201224 Self-supervised Pre-training with Hard Examples Improves Visual #mixup
- 200403 Self-Supervised Viewpoint Learning From Image Collections #viewpoint
- 201127 Unsupervised part representation by Flow Capsules
- 210429 MarioNette
- 200307 StyleGAN2 Distillation for Feed-forward Image Manipulation #stylegan
- 200308 PULSE #stylegan
- 200406 GANSpace
- 201127 Navigating the GAN Parameter Space for Semantic Image Editing #image_editing
- 201222 Time-Travel Rephotography #restoration #stylegan
- 200323 Learning Dynamic Routing for Semantic Segmentation
- 200516 Single-Stage Semantic Segmentation from Image Labels
- 200826 EfficientFCN
- 210512 Segmenter
- 200218 DivideMix #mixup #noise #dataset
- 200306 Semi-Supervised StyleGAN for Disentanglement Learning #stylegan #mixup
- 200323 Meta Pseudo Labels #meta_learning
- 200627 Laplacian Regularized Few-Shot Learning #few_shot
- 200724 Deep Co-Training with Task Decomposition for Semi-Supervised Domain #domain_adaptation
- 201116 On the Marginal Benefit of Active Learning #active_learning #unsupervised_training
- 201118 FROST
- 200129 Speech Recognition
- 200129 WaveFlow #conditional_generative_model
- 200318 A Content Transformation Block For Image Style Transfer
- 200324 Deformable Style Transfer
- 200710 Geometric Style Transfer
- 200803 Encoding in Style #gan_inversion
- 210318 Labels4Free #unsupervised_segmentation
- 200130 Unlikelihood Training
- 200601 Cascaded Text Generation with Markov Transformers #decoding
- 200605 CoCon
- 200402 Tracking Objects as Points #keypoint
- 200403 FairMOT
- 200506 PeTra
- 201215 Detecting Invisible People
- 200130 BiT ResNet #resnet
- 200711 Adversarially-Trained Deep Nets Transfer Better #adversarial_training
- 200716 Do Adversarially Robust ImageNet Models Transfer Better #robustness
- 200721 Adversarial Training Reduces Information and Improves Transferability #adversarial_training
- 201122 Ranking Neural Checkpoints
- 200129 Are Transformers universal approximators
- 200129 Product Key Memory #attention
- 200129 Reformer #attention
- 200130 Sparse Transformer #generative_model
- 200130 Structured Pruning for LM #pruning
- 200207 Transformer Transducer #asr #transducer
- 200211 On Layer Normalization in the Transformer Architecture #normalization
- 200212 GLU Variants Improve Transformer #activation
- 200214 Transformer on a Diet #efficient_attention
- 200214 Transformers as Soft Reasoners over Language #language
- 200215 Fine-Tuning Pretrained Language Models #bert #finetuning
- 200221 Addressing Some Limitations of Transformers with Feedback Memory #recurrent
- 200305 Talking-Heads Attention #attention
- 200424 Lite Transformer with Long-Short Range Attention #lightweight
- 200515 Finding Experts in Transformer Models
- 200515 JDI-T #tts
- 200516 Conformer #asr
- 200518 Weak-Attention Suppression For Transformer Based Speech Recognition #asr
- 200605 Funnel-Transformer #efficient_attention
- 200707 Do Transformers Need Deep Long-Range Memory #lm #attention
- 200709 Fast Transformers with Clustered Attention #attention
- 200715 AdapterHub #nlp #finetuning
- 200727 Big Bird #attention
- 200802 DeLighT #nlp
- 201217 Taming Transformers for High-Resolution Image Synthesis #discrete_vae #generative_model #autoregressive_model
- 201221 RealFormer #attention
- 201227 SG-Net #syntax #attention
- 210223 Do Transformer Modifications Transfer Across Implementations and
- 210225 Evolving Attention with Residual Convolutions #attention
- 210318 HiT #video #retrieval
- 210318 Looking Beyond Two Frames #tracking
- 210318 TFPose #pose
- 210318 TransCenter #tracking
- 210318 Transformer Tracking #tracking
- 210407 Seeing Out of tHe bOx #multimodal #vision_language
- 210409 Efficient Large-Scale Language Model Training on GPU Clusters #distributed_training
- 210409 Not All Attention Is All You Need
- 210410 UniDrop #regularization
- 210417 Demystifying the Better Performance of Position Encoding Variants for #positional_encoding
- 210420 RoFormer #positional_encoding
- 210423 M3DeTR #3d
- 210509 FNet #efficient_attention #fourier
- 200512 Flowtron #flow
- 200310 Unpaired Image-to-Image Translation using Adversarial Consistency Loss
- 200611 Rethinking the Truly Unsupervised Image-to-Image Translation
- 201201 Unpaired Image-to-Image Translation via Latent Energy Transport
- 200707 NVAE
- 201119 Dual Contradistinctive Generative Autoencoder
- 201120 Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them
- 201127 General Multi-label Image Classification with Transformers
- 201223 A Survey on Visual Transformer
- 201223 Training data-efficient image transformers & distillation through #distillation
- 210223 Pyramid Vision Transformer
- 210318 CrossViT
- 210318 CvT
- 210318 Multi-Scale Vision Longformer
- 210319 ConViT
- 210319 Scalable Visual Transformers with Hierarchical Pooling
- 210324 Vision Transformers for Dense Prediction #fpn
- 210325 Swin Transformer #local_attention
- 210331 Going deeper with Image Transformers
- 210402 LeViT
- 210421 Token Labeling
- 210422 Multiscale Vision Transformers
- 210422 So-ViT
- 210426 Improve Vision Transformers Training by Suppressing Over-smoothing
- 210426 Visformer
- 210427 ConTNet
- 210428 Twins #local_attention #positional_encoding
- 210509 Conformer
- 210515 Are Convolutional Neural Networks or Transformers more like human vision #cnn #inductive_bias
- 210517 Rethinking the Design Principles of Robust Vision Transformer #robustness
- 210526 Aggregating Nested Transformers #local_attention
- 210529 Less is More
- 210603 DynamicViT #sparse_attention