Neural Network
Benchmark
Word Embedding
Security
Explainable Artificial Intelligence (XAI)
Language Model
Computer Vision
Reinforcement Learning
Information Retrieval
Tabular Learning
Meta Learning
Continual Learning
Mixture of Experts
Model Compression
Knowledge Distillation
Quantization
Theme | Number | Title | Journal/Conference | Date | Author | Link |
---|---|---|---|---|---|---|
Neural Network | 1 | Random Forests | Machine Learning, Volume 45 | 2001-01-01 | Leo Breiman | Link |
2 | Visualizing Data using t-SNE | JMLR 2008 | 2008-01-01 | Laurens van der Maaten et al | Link | |
3 | Practical Bayesian Optimization of Machine Learning Algorithms | 2012-06-13 | Jasper Snoek et al | Link | ||
4 | Dropout: A Simple Way to Prevent Neural Networks from Overfitting | JMLR 2014 | 2014-01-01 | Nitish Srivastava et al | Link | |
5 | Adam: A Method for Stochastic Optimization | ICLR 2015 | 2014-12-22 | Diederik P. Kingma et al | Link | |
6 | Convolutional Neural Networks for Sentence Classification | EMNLP 2014 | 2014-08-25 | Yoon Kim et al | Link | |
7 | Efficient Per-Example Gradient Computations | ICML 2016 | 2015-10-07 | Ian Goodfellow et al | Link | |
8 | XGBoost: A Scalable Tree Boosting System | KDD 2016 | 2016-03-09 | Tianqi Chen et al | Link | |
9 | Permutation Invariant Training of Deep Models for Speaker-Independent Multi-talker Speech Separation | 2016-07-01 | Dong Yu et al | Link | ||
10 | SGDR: Stochastic Gradient Descent with Warm Restarts | ICLR 2017 | 2016-08-13 | Ilya Loshchilov et al | Link | |
11 | A Corpus of Natural Language for Visual Reasoning | ACL 2017 | 2017-01-01 | Alane Suhr et al | Link | |
12 | Inductive Representation Learning on Large Graphs | NIPS 2017 | 2017-06-07 | William L. Hamilton et al | Link | |
13 | Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour | NeurIPS 2017 | 2017-06-08 | Priya Goyal et al | Link | |
14 | Don't Decay the Learning Rate, Increase the Batch Size | ICLR 2018 | 2017-11-01 | Samuel L. Smith et al | Link | |
15 | Population Based Training of Neural Networks | 2017-11-27 | Max Jaderberg et al | Link | ||
16 | Self-Attention with Relative Position Representations | NAACL 2018 | 2018-03-06 | Peter Shaw et al | Link | |
17 | A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay | PBML | 2018-03-26 | Leslie N. Smith et al | Link | |
18 | Iterative search for weakly supervised semantic parsing | NAACL 2019 | 2019-01-01 | Pradeep Dasigi et al | Link | |
19 | Dying ReLU and Initialization: Theory and Numerical Examples | 2019-03-15 | Lu Lu et al | Link | ||
20 | MGAT: Multi-view Graph Attention Networks | Neural Networks 2020 | 2020-01-01 | Yu Xie et al | Link | |
21 | Permute, Quantize, and Fine-tune: Efficient Compression of Neural Networks | PMLR 2021 | 2020-10-29 | Julieta Martinez et al | Link | |
22 | Ensemble deep learning: A review | 2021-04-06 | M. A. Ganaie et al | Link | ||
23 | Do Transformers Really Perform Bad for Graph Representation? | NeurIPS 2021 | 2021-06-09 | Chengxuan Ying et al | Link | |
24 | R-Drop: Regularized Dropout for Neural Networks | NeurIPS 2021 | 2021-06-28 | Xiaobo Liang et al | Link | |
25 | GRPE: Relative Positional Encoding for Graph Transformer | ICLR 2022 | 2022-01-30 | Wonpyo Park et al | Link | |
26 | Visual Instruction Tuning | NeurIPS 2023 | 2023-04-17 | Haotian Liu et al | Link | |
Benchmark | 27 | BLEU: a method for automatic evaluation of machine translation | ACL 2002 | 2002-07-01 | Kishore Papineni et al | Link |
28 | METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments | ACL 2005 | 2005-01-01 | Satanjeev Banerjee et al | Link | |
29 | e-SNLI: Natural Language Inference with Natural Language Explanations | NeurIPS 2018 | 2018-01-01 | Oana-Maria Camburu et al | Link | |
30 | GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding | 2018-04-20 | Alex Wang et al | Link | ||
31 | A Call for Clarity in Reporting BLEU Scores | ACL 2018 | 2018-04-23 | Matt Post et al | Link | |
32 | CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | NAACL 2019 | 2018-11-02 | Alon Talmor et al | Link | |
33 | DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs | NAACL 2019 | 2019-03-01 | Dheeru Dua et al | Link | |
34 | BERTScore: Evaluating Text Generation with BERT | ICLR 2020 | 2019-04-21 | Tianyi Zhang et al | Link | |
35 | MathQA: Towards Interpretable Math Word Problem Solving with Operation-Based Formalisms | NAACL 2019 | 2019-05-30 | Aida Amini et al | Link | |
36 | QASC: A Dataset for Question Answering via Sentence Composition | AAAI 2020 | 2019-10-25 | Tushar Khot et al | Link | |
37 | BLEURT: Learning Robust Metrics for Text Generation | ACL 2020 | 2020-04-09 | Thibault Sellam et al | Link | |
38 | Did Aristotle Use a Laptop? A Question Answering Benchmark with Implicit Reasoning Strategies | TACL 2021 | 2021-01-06 | Mor Geva et al | Link | |
39 | TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance | ACL 2021 | 2021-05-17 | Fengbin Zhu et al | Link | |
40 | KLUE: Korean Language Understanding Evaluation | 2021-05-20 | Sungjoon Park et al | Link | ||
41 | BARTScore: Evaluating Generated Text as Text Generation | NeurIPS 2021 | 2021-06-22 | Weizhe Yuan et al | Link | |
42 | FinQA: A Dataset of Numerical Reasoning over Financial Data | EMNLP 2021 | 2021-09-01 | Zhiyu Chen et al | Link | |
43 | CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge | NeurIPS 2021 | 2021-09-03 | Yasumasa Onoe et al | Link | |
44 | ROUGE: A Package for Automatic Evaluation of Summaries | Artificial Intelligence Volume 299 | 2021-10-01 | Chin-Yew Lin et al | Link | |
45 | ConvFinQA: Exploring the Chain of Numerical Reasoning in Conversational Finance Question Answering | EMNLP 2022 | 2022-10-07 | Zhiyu Chen et al | Link | |
Word Embedding | 46 | Efficient Estimation of Word Representations in Vector Space | 2013-01-16 | Tomas Mikolov et al | Link | |
47 | Linguistic Regularities in Continuous Space Word Representations | 2013-06-01 | Tomas Mikolov et al | Link | ||
48 | Dear Sir or Madam, May I introduce the GYAFC Dataset: Corpus, Benchmarks and Metrics for Formality Style Transfer | 2018-03-17 | Sudha Rao et al | Link | ||
49 | SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing | ACL 2019 | 2018-08-19 | Taku Kudo et al | Link | |
50 | Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks | AAAI 2020 | 2019-08-27 | Nils Reimers et al | Link | |
51 | SimCSE: Simple Contrastive Learning of Sentence Embeddings | EMNLP 2021 | 2021-04-18 | Tianyu Gao et al | Link | |
52 | Self-Guided Contrastive Learning for BERT Sentence Representations | ACL 2021 | 2021-06-03 | Taeuk Kim et al | Link | |
53 | Contrastive Learning of Sentence Embeddings from Scratch | 2023-05-24 | Junlei Zhang et al | Link | ||
Explainable Artificial Intelligence (XAI) | 54 | Building Machines That Learn and Think Like People | NeurIPS 2016 | 2016-04-01 | Brenden M. Lake et al | Link |
55 | Towards A Rigorous Science of Interpretable Machine Learning | 2017-02-28 | Finale Doshi-Velez et al | Link | ||
56 | A Multiscale Visualization of Attention in the Transformer Model | ACL 2019 | 2019-06-12 | Jesse Vig et al | Link | |
57 | On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines | IEEE Open Journal of the Computer Society | 2020-06-08 | Marius Mosbach et al | Link | |
58 | Reliable Post hoc Explanations: Modeling Uncertainty in Explainability | 2020-08-11 | Dylan Slack et al | Link | ||
59 | Reward is enough | ICLR 2022 | 2021-10-01 | David Silver et al | Link | |
Language Model | 60 | Minimum Bayes-Risk Decoding for Statistical Machine Translation | NAACL 2004 | 2004-01-01 | Shankar Kumar et al | Link |
61 | Large-Scale Distributed Language Modeling | IEEE 2007 | 2007-04-05 | Ahmad Emami et al | Link | |
62 | Large Language Models in Machine Translation | EMNLP-CoNLL 2007 | 2007-06-01 | Thorsten Brants et al | Link | |
63 | Learning Dependency-Based Compositional Semantics | NeurIPS 2012 | 2011-09-30 | Percy Liang et al | Link | |
64 | Neural Machine Translation by Jointly Learning to Align and Translate | ICLR 2015 | 2014-09-01 | Dzmitry Bahdanau et al | Link | |
65 | Sequence to Sequence Learning with Neural Networks | NeurIPS 2014 | 2014-09-10 | Ilya Sutskever et al | Link | |
66 | Neural Machine Translation of Rare Words with Subword Units | ACL 2016 | 2015-08-31 | Rico Sennrich et al | Link | |
67 | Continuous control with deep reinforcement learning | 2015-09-09 | Timothy P. Lillicrap et al | Link | ||
68 | Unsupervised Deep Embedding for Clustering Analysis | 2015-11-19 | Junyuan Xie et al | Link | ||
69 | Hierarchical Attention Networks for Document Classification | KDD 2016 | 2016-01-01 | Zichao Yang et al | Link | |
70 | Hierarchical Attention Networks for Document Classification | 2016-01-01 | Zichao Yang et al | Link | ||
71 | Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation | 2016-09-26 | Yonghui Wu et al | Link | ||
72 | Understanding deep learning requires rethinking generalization | ICLR 2017 | 2016-11-10 | Chiyuan Zhang et al | Link | |
73 | SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents | AAAI 2017 | 2016-11-14 | Ramesh Nallapati et al | Link | |
74 | Deep Biaffine Attention for Neural Dependency Parsing | ICLR 2017 | 2016-11-06 | Timothy Dozat et al | Link | |
75 | Search-based Neural Structured Learning for Sequential Question Answering | ACL 2017 | 2017-01-01 | Mohit Iyyer et al | Link | |
76 | Reading Wikipedia to Answer Open-Domain Questions | ACL 2017 | 2017-03-31 | Danqi Chen et al | Link | |
77 | Get To The Point: Summarization with Pointer-Generator Networks | 2017-04-14 | Abigail See et al | Link | ||
78 | Learning to Ask: Neural Question Generation for Reading Comprehension | ACL 2017 | 2017-04-29 | Xinya Du et al | Link | |
79 | Style Transfer from Non-Parallel Text by Cross-Alignment | 2017-05-26 | Tianxiao Shen et al | Link | ||
80 | Attention Is All You Need | 2017-06-12 | Ashish Vaswani et al | Link | ||
81 | Adversarial Examples for Evaluating Reading Comprehension Systems | 2017-07-23 | Robin Jia et al | Link | ||
82 | Self-Attention with Relative Position Representations | NAACL 2018 | 2018-01-01 | Peter Shaw et al | Link | |
83 | Personalizing Dialogue Agents: I have a dog, do you have pets too? | ACL 2018 | 2018-01-22 | Saizheng Zhang et al | Link | |
84 | Deep contextualized word representations | 2018-02-15 | Matthew E. Peters et al | Link | ||
85 | Training Tips for the Transformer Model | 2018-04-01 | Martin Popel et al | Link | ||
86 | Hierarchical Neural Story Generation | 2018-05-13 | Angela Fan et al | Link | ||
87 | Know What You Don't Know: Unanswerable Questions for SQuAD | ACL 2018 | 2018-06-11 | Pranav Rajpurkar et al | Link | |
88 | Know What You Don't Know: Unanswerable Questions for SQuAD | AAAI 2019 | 2018-06-11 | Pranav Rajpurkar et al | Link | |
89 | Read + Verify: Machine Reading Comprehension with Unanswerable Questions | 2018-08-17 | Minghao Hu et al | Link | ||
90 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | 2018-10-11 | Jacob Devlin et al | Link | ||
91 | A Recurrent BERT-based Model for Question Generation | ACL 2019 | 2019-01-01 | Ying-Hong Chan et al | Link | |
92 | A Goal-Driven Tree-Structured Neural Model for Math Word Problems | IJCAI 2019 | 2019-01-01 | Zhipeng Xie et al | Link | |
93 | EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks | 2019-01-31 | Jason Wei et al | Link | ||
94 | Good-Enough Compositional Data Augmentation | 2019-04-21 | Jacob Andreas et al | Link | ||
95 | The Curious Case of Neural Text Degeneration | 2019-04-22 | Ari Holtzman et al | Link | ||
96 | Language Models are Unsupervised Multitask Learners | 2019-06-01 | Alec Radford et al | Link | ||
97 | Explain Yourself! Leveraging Language Models for Commonsense Reasoning | ACL 2019 | 2019-06-06 | Nazneen Fatema Rajani et al | Link | |
98 | SpanBERT: Improving Pre-training by Representing and Predicting Spans | 2019-07-24 | Mandar Joshi et al | Link | ||
99 | Neural Text Generation with Unlikelihood Training | 2019-08-12 | Sean Welleck et al | Link | ||
100 | TabNet: Attentive Interpretable Tabular Learning | EMNLP 2019 | 2019-08-20 | Sercan O. Arik et al | Link | |
101 | Text Summarization with Pretrained Encoders | 2019-08-22 | Yang Liu et al | Link | ||
102 | Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks | 2019-08-27 | Nils Reimers et al | Link | ||
103 | Alpaca: Intermittent Execution without Checkpoints | NeurIPS 2019 | 2019-09-13 | Kiwan Maeng et al | Link | |
104 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | JMLR 2020 | 2019-10-23 | Colin Raffel et al | Link | |
105 | BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | 2019-10-29 | Mike Lewis et al | Link | ||
106 | Masked Language Model Scoring | ACL 2020 | 2019-10-31 | Julian Salazar et al | Link | |
107 | Fast Transformer Decoding: One Write-Head is All You Need | 2019-11-06 | Noam Shazeer et al | Link | ||
108 | Improving Transformer Optimization Through Better Initialization | NeurIPS 2020 | 2020-01-01 | Xiao Shi Huang et al | Link | |
109 | A Learning-Exploring Method to Generate Diverse Paraphrases with Multi-Objective Deep Reinforcement Learning | Coling 2020 | 2020-01-01 | Mingtong Liu et al | Link | |
110 | Data Augmentation using Pre-trained Transformer Models | AACL 2020 | 2020-03-04 | Varun Kumar et al | Link | |
111 | ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | ICLR 2020 | 2020-03-23 | Kevin Clark et al | Link | |
112 | Dense Passage Retrieval for Open-Domain Question Answering | EMNLP 2020 | 2020-04-10 | Vladimir Karpukhin et al | Link | |
113 | Understanding the Difficulty of Training Transformers | 2020-04-17 | Liyuan Liu et al | Link | ||
114 | Efficient Second-Order TreeCRF for Neural Dependency Parsing | ACL 2020 | 2020-05-03 | Yu Zhang et al | Link | |
115 | Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language? | EMNLP Findings 2020 | 2020-10-08 | Peter Hase et al | Link | |
116 | Answer Span Correction in Machine Reading Comprehension | EMNLP 2020 | 2020-11-06 | Revanth Gangi Reddy et al | Link | |
117 | Learning by Fixing: Solving Math Word Problems with Weak Supervision | AAAI 2021 | 2020-12-19 | Yining Hong et al | Link | |
118 | An Edge-Enhanced Hierarchical Graph-to-Tree Network for Math Word Problem Solving | EMNLP Findings 2021 | 2021-01-01 | Qinzhuo Wu et al | Link | |
119 | RoFormer: Enhanced Transformer with Rotary Position Embedding | 2021-04-20 | Jianlin Su et al | Link | ||
120 | SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization | ACL 2021 | 2021-06-03 | Yixin Liu et al | Link | |
121 | Finetuned Language Models Are Zero-Shot Learners | ICLR 2022 | 2021-09-03 | Jason Wei et al | Link | |
122 | DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing | 2021-11-18 | Pengcheng He et al | Link | ||
123 | Post-Training with Interrogative Sentences for Enhancing BART-based Korean Question Generator | ACL 2022 | 2022-01-01 | Gyu-Min Park et al | Link | |
124 | FNet: Mixing Tokens with Fourier Transforms | NAACL 2022 | 2022-01-01 | James Lee-Thorp et al | Link | |
125 | BERTopic: Neural topic modeling with a class-based TF-IDF procedure | 2022-03-11 | Maarten Grootendorst et al | Link | ||
126 | SummaReranker: A Multi-Task Mixture-of-Experts Re-ranking Framework for Abstractive Summarization | ACL 2022 | 2022-03-13 | Mathieu Ravaut et al | Link | |
127 | GPT-NeoX-20B: An Open-Source Autoregressive Language Model | BigScience 2022 | 2022-04-14 | Sid Black et al | Link | |
128 | Self-Consistency Improves Chain of Thought Reasoning in Language Models | ICLR 2023 | 2022-03-21 | Xuezhi Wang et al | Link | |
129 | Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks | EMNLP 2022 | 2022-04-16 | Yizhong Wang et al | Link | |
130 | MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification | ACL 2020 | 2022-04-25 | Jiaao Chen et al | Link | |
131 | Learning to Transfer Prompts for Text Generation | NAACL 2022 | 2022-05-03 | Junyi Li et al | Link | |
132 | On the Use of BERT for Automated Essay Scoring: Joint Learning of Multi-Scale Essay Representation | EMNLP 2022 | 2022-05-08 | Yongjie Wang et al | Link | |
133 | KOLD: Korean Offensive Language Dataset | 2022-05-23 | Younghoon Jeong et al | Link | ||
134 | CoNT: Contrastive Neural Text Generation | 2022-05-29 | Chenxin An et al | Link | ||
135 | CoNT: Contrastive Neural Text Generation | ICLR 2022 | 2022-05-29 | Chenxin An et al | Link | |
136 | GODEL: Large-Scale Pre-Training for Goal-Directed Dialog | 2022-06-22 | Baolin Peng et al | Link | ||
137 | Answering Numerical Reasoning Questions in Table-Text Hybrid Contents with Graph-based Encoder and Tree-based Decoder | COLING 2022 | 2022-09-16 | Fangyu Lei et al | Link | |
138 | ReAct: Synergizing Reasoning and Acting in Language Models | 2022-10-06 | Shunyu Yao et al | Link | ||
139 | Generative Language Models for Paragraph-Level Question Generation | EMNLP 2022 | 2022-10-08 | Asahi Ushio et al | Link | |
140 | Explanations from Large Language Models Make Small Reasoners Better | 2022-10-13 | Shiyang Li et al | Link | ||
141 | NoisyQuant: Noisy Bias-Enhanced Post-Training Activation Quantization for Vision Transformers | CVPR 2023 | 2022-11-29 | Yijiang Liu et al | Link | |
142 | Self-Instruct: Aligning Language Models with Self-Generated Instructions | ACL 2023 | 2022-12-20 | Yizhong Wang et al | Link | |
143 | Dialog-Post Multi-Level Self-Supervised Objectives and Hierarchical Model for Dialogue Post-Training | ACL 2023 | 2023-01-01 | Zhenyu Zhang et al | Link | |
144 | LLaMA: Open and Efficient Foundation Language Models | 2023-02-27 | Hugo Touvron et al | Link | ||
145 | CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society | NeurIPS 2023 | 2023-03-31 | Guohao Li et al | Link | |
146 | Instruction Tuning with GPT-4 | EMNLP 2023 | 2023-04-06 | Baolin Peng et al | Link | |
147 | Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting | 2023-05-07 | Miles Turpin et al | Link | ||
148 | Improving Factuality and Reasoning in Language Models through Multiagent Debate | 2023-05-23 | Yilun Du et al | Link | ||
149 | An Empirical Comparison of LM-based Question and Answer Generation Methods | ACL 2023 | 2023-05-26 | Asahi Ushio et al | Link | |
150 | A Practical Toolkit for Multilingual Question and Answer Generation | 2023-05-27 | Asahi Ushio et al | Link | ||
151 | LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion | ACL 2023 | 2023-06-05 | Dongfu Jiang et al | Link | |
152 | LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models | ICLR 2024 | 2023-09-21 | Yukang Chen et al | Link | |
153 | Mistral 7B | 2023-10-10 | Albert Q. Jiang et al | Link | ||
154 | Gemma: Open Models Based on Gemini Research and Technology | 2024-03-13 | Gemma Team et al | Link | ||
155 | An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing | ACL 2024 | 2024-03-25 | Ziwei Chai et al | Link | |
Meta Learning | 156 | Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks | ICML 2017 | 2017-03-09 | Chelsea Finn et al | Link |
157 | Optimization as A Model for Few-shot Learning | ICLR 2017 | 2017-07-22 | Sachin Ravi et al | Link | |
158 | BERT Learns to Teach: Knowledge Distillation with Meta Learning | 2021-06-08 | Wangchunshu Zhou et al | Link | ||
Continual Learning | 159 | Overcoming catastrophic forgetting in neural networks | 2016-12-02 | James Kirkpatrick et al | Link | |
Mixture of Experts | 160 | Adaptive Mixtures of Local Experts | MIT Press 1991 | 1991-03-01 | Robert A. Jacobs et al | Link |
Model Compression | 161 | Model Compression | ACM SIGKDD 2006 | 2006-08-20 | Cristian Bucilă et al | Link |
162 | Adaptive Computation Time for Recurrent Neural Networks | Behavioral and Brain Sciences | 2016-03-29 | Alex Graves et al | Link | |
163 | BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks | 2017-09-06 | Surat Teerapittayanon et al | Link | ||
164 | FastBERT: a Self-distilling BERT with Adaptive Inference Time | PMLR 2020 | 2020-04-05 | Weijie Liu et al | Link | |
165 | A Survey on Model Compression and Acceleration for Pretrained Language Models | ICLR 2022 | 2022-02-15 | Canwen Xu et al | Link | |
Knowledge Distillation | 166 | Distilling the Knowledge in a Neural Network | NIPS 2014 | 2015-03-09 | Geoffrey Hinton et al | Link |
167 | Improved Knowledge Distillation via Teacher Assistant | AAAI 2020 | 2019-02-09 | Seyed-Iman Mirzadeh et al | Link | |
168 | Unified Language Model Pre-training for Natural Language Understanding and Generation | 2019-05-08 | Li Dong et al | Link | ||
169 | Patient Knowledge Distillation for BERT Model Compression | EMNLP 2019 | 2019-08-25 | Siqi Sun et al | Link | |
170 | DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter | JMLR | 2019-10-02 | Victor Sanh et al | Link | |
171 | MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers | 2020-02-25 | Wenhui Wang et al | Link | ||
172 | MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers | PMLR 2021 | 2020-12-31 | Wenhui Wang et al | Link | |
173 | Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes | ACL 2023 | 2023-05-03 | Cheng-Yu Hsieh et al | Link | |
Quantization | 174 | Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation | NeurIPS 2013 | 2013-08-15 | Yoshua Bengio et al | Link |
175 | XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks | 2016-03-16 | Mohammad Rastegari et al | Link | ||
176 | DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients | IEEE 2017 | 2016-06-20 | Shuchang Zhou et al | Link | |
177 | Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations | 2016-09-22 | Itay Hubara et al | Link | ||
178 | Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference | 2017-12-15 | Benoit Jacob et al | Link | ||
179 | And the Bit Goes Down: Revisiting the Quantization of Neural Networks | ICLR 2020 | 2019-07-12 | Pierre Stock et al | Link | |
180 | Learned Step Size Quantization | ICLR 2020 | 2019-02-21 | Steven K. Esser et al | Link | |
181 | Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT | 2019-09-12 | Sheng Shen et al | Link | ||
182 | Quantization Networks | 2019-11-21 | Jiwei Yang et al | Link | ||
183 | ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions | 2020-03-07 | Zechun Liu et al | Link | ||
184 | Binary Neural Networks: A Survey | 2020-03-31 | Haotong Qin et al | Link | ||
185 | Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation | ACSAC 2021 | 2020-04-20 | Hao Wu et al | Link | |
186 | BinaryBERT: Pushing the Limit of BERT Quantization | ACL-IJCNLP 2021 | 2020-12-31 | Haoli Bai et al | Link | |
187 | I-BERT: Integer-only BERT Quantization | ICML 2021 | 2021-01-05 | Sehoon Kim et al | Link | |
188 | A Survey of Quantization Methods for Efficient Neural Network Inference | AAAI 2022 | 2021-03-25 | Amir Gholami et al | Link | |
189 | A White Paper on Neural Network Quantization | ICLR 2022 | 2021-06-15 | Markus Nagel et al | Link | |
190 | BiBERT: Accurate Fully Binarized BERT | 2022-03-12 | Haotong Qin et al | Link | ||
191 | BiT: Robustly Binarized Multi-distilled Transformer | NeurIPS 2022 | 2022-05-25 | Zechun Liu et al | Link | |
192 | QLoRA: Efficient Finetuning of Quantized LLMs | NeurIPS 2023 | 2023-05-23 | Tim Dettmers et al | Link | |
193 | QuIP: 2-Bit Quantization of Large Language Models With Guarantees | 2023-07-25 | Jerry Chee et al | Link | ||
Reinforcement Learning | 194 | Technical Note: Q-Learning | Machine Learning | 1992-01-01 | Christopher J.C.H. Watkins et al | Link |
195 | Bayesian Q-learning | AAAI 1998 | 1998-01-01 | Richard Dearden et al | Link | |
196 | Policy invariance under reward transformations: Theory and application to reward shaping | 1999-01-01 | A. Ng et al | Link | ||
197 | Policy Shaping: Integrating Human Feedback with Reinforcement Learning | 2013-01-01 | Shane Griffith et al | Link | ||
198 | Playing Atari with Deep Reinforcement Learning | ICLR 2014 | 2013-12-19 | Volodymyr Mnih et al | Link | |
199 | Proximal Policy Optimization Algorithms | 2017-07-20 | John Schulman et al | Link | ||
200 | Rainbow: Combining Improvements in Deep Reinforcement Learning | AAAI 2018 | 2017-10-06 | Matteo Hessel et al | Link | |
201 | Time Limits in Reinforcement Learning | ICML 2018 | 2017-12-01 | Fabio Pardo et al | Link | |
202 | Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research | NAACL 2018 | 2018-02-26 | Matthias Plappert et al | Link | |
203 | Learning to Generalize from Sparse and Underspecified Rewards | ICML 2019 | 2019-02-19 | Rishabh Agarwal et al | Link | |
204 | CURL: Contrastive Unsupervised Representations for Reinforcement Learning | 2020-04-08 | Aravind Srinivas et al | Link | ||
205 | Goal Density based Hindsight Experience Prioritization for Multi Goal Robot Manipulation Reinforcement Learning | ICLR 2021 | 2020-09-04 | Yingyi Kuang et al | Link | |
206 | The Role of Tactile Sensing in Learning and Deploying Grasp Refinement Algorithms | 2021-09-23 | Alexander Koenig et al | Link | ||
207 | The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks | 2021-10-12 | Rahim Entezari et al | Link | ||
208 | SoMoGym: A Toolkit for Developing and Evaluating Controllers and Reinforcement Learning Algorithms for Soft Robots | IEEE Robotics and Automation Letters 2022 | 2022-01-01 | Moritz A. Graule et al | Link | |
209 | Learning-Based Slip Detection for Robotic Fruit Grasping and Manipulation under Leaf Interference | NeurIPS 2022 | 2022-06-01 | Hongyu Zhou et al | Link | |
210 | Augmenting Vision-Based Grasp Plans for Soft Robotic Grippers using Reinforcement Learning | NeurIPS 2022 | 2022-08-24 | Vighnesh Vatsal et al | Link | |
211 | DreamWaQ: Learning Robust Quadrupedal Locomotion With Implicit Terrain Imagination via Deep Reinforcement Learning | ICRA 2023 | 2023-01-25 | I Made Aswin Nahrendra et al | Link | |
Information Retrieval | 212 | Document Expansion by Query Prediction | 2019-04-17 | Rodrigo Nogueira et al | Link | |
213 | Latent Retrieval for Weakly Supervised Open Domain Question Answering | 2019-06-01 | Kenton Lee et al | Link | ||
214 | ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT | SIGIR 2020 | 2020-04-27 | Omar Khattab et al | Link | |
215 | Differentially Private Learning Needs Better Features (or Much More Data) | 2020-11-23 | Florian Tramèr et al | Link | ||
216 | Learning Dense Representations of Phrases at Scale | ACL 2021 | 2020-12-23 | Jinhyuk Lee et al | Link | |
217 | BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models | 2021-04-17 | Nandan Thakur et al | Link | ||
218 | Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval | ICLR 2022 | 2021-08-12 | Luyu Gao et al | Link | |
219 | Multi-View Document Representation Learning for Open-Domain Dense Retrieval | ACL 2022 | 2022-03-16 | Shunyu Zhang et al | Link | |
220 | Learning Diverse Document Representations with Deep Query Interactions for Dense Retrieval | 2022-08-08 | Zehan Li et al | Link | ||
221 | CAPSTONE: Curriculum Sampling for Dense Retrieval with Document Expansion | EMNLP 2023 | 2022-12-18 | Xingwei He et al | Link | |
Tabular Learning | 222 | Compositional Semantic Parsing on Semi-Structured Tables | 2015-08-03 | Panupong Pasupat et al | Link | |
223 | Neural Text Generation from Structured Data with Application to the Biography Domain | EMNLP 2016 | 2016-03-24 | Remi Lebret et al | Link | |
224 | The E2E Dataset: New Challenges For End-to-End Generation | 2017-06-28 | Jekaterina Novikova et al | Link | ||
225 | Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning | 2017-08-31 | Victor Zhong et al | Link | ||
226 | Coarse-to-Fine Decoding for Neural Semantic Parsing | 2018-05-12 | Li Dong et al | Link | ||
227 | Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations | EMNLP 2018 | 2018-09-05 | Dipendra Misra et al | Link | |
228 | TabFact: A Large-scale Dataset for Table-based Fact Verification | ICLR 2020 | 2019-09-05 | Wenhu Chen et al | Link | |
229 | RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers | ACL 2020 | 2019-11-10 | Bailin Wang et al | Link | |
230 | TAPAS: Weakly Supervised Table Parsing via Pre-training | ACL 2020 | 2020-04-05 | Jonathan Herzig et al | Link | |
231 | TabTransformer: Tabular Data Modeling Using Contextual Embeddings | USENIX 2021 | 2020-12-11 | Xin Huang et al | Link | |
232 | FeTaQA: Free-form Table Question Answering | 2020-04-01 | Linyong Nan et al | Link | ||
233 | Logical Natural Language Generation from Open-Domain Tables | ACL 2020 | 2020-04-22 | Wenhu Chen et al | Link | |
234 | Logic2Text: High-Fidelity Natural Language Generation from Logical Forms | EMNLP 2020 | 2020-04-30 | Zhiyu Chen et al | Link | |
235 | TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data | ACL 2020 | 2020-05-17 | Pengcheng Yin et al | Link | |
236 | Program Enhanced Fact Verification with Verbalization and Graph Attention Network | EMNLP 2020 | 2020-10-06 | Xiaoyu Yang et al | Link | |
237 | DoT: An efficient Double Transformer for NLP tasks with tables | TACL 2022 | 2021-01-01 | Syrine Krichene et al | Link | |
238 | TAT-QA: A Question Answering Benchmark on a Hybrid of Tabular and Textual Content in Finance | ACL 2021 | 2021-05-17 | Fengbin Zhu et al | Link | |
239 | SpreadsheetCoder: Formula Prediction from Semi-structured Context | ICML 2021 | 2021-06-26 | Xinyun Chen et al | Link | |
240 | TAPEX: Table Pre-training via Learning a Neural SQL Executor | ICLR 2022 | 2021-07-16 | Qian Liu et al | Link | |
241 | Logic-level Evidence Retrieval and Graph-based Verification Network for Table-based Fact Verification | EMNLP 2021 | 2021-09-14 | Qi Shi et al | Link | |
242 | Multi-Row, Multi-Span Distant Supervision For Table+Text Question | 2021-12-14 | Vishwajeet Kumar et al | Link | ||
243 | Learning to Generate Programs for Table Fact Verification via Structure-Aware Semantic Parsing | NeurIPS 2022 | 2022-01-01 | Suixin Ou et al | Link | |
244 | Enhancing Financial Table and Text Question Answering with Tabular Graph and Numerical Reasoning | AACL 2022 | 2022-01-01 | Rungsiman Nararatwong et al | Link | |
245 | Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | 2022-01-28 | Jason Wei et al | Link | ||
246 | TableFormer: Robust Transformer Modeling for Table-Text Encoding | ACL 2022 | 2022-03-01 | Jingfeng Yang et al | Link | |
247 | Robust (Controlled) Table-to-Text Generation with Structure-Aware Equivariance Learning | EMNLP 2022 | 2022-05-08 | Fei Wang et al | Link | |
248 | R2D2: Robust Data-to-Text with Replacement Detection | EMNLP 2022 | 2022-05-25 | Linyong Nan et al | Link | |
249 | PLOG: Table-to-Logic Pretraining for Logical Table-to-Text Generation | 2022-05-25 | Ao Liu et al | Link | ||
250 | OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering | NAACL 2022 | 2022-07-08 | Zhengbao Jiang et al | Link | |
251 | Why do tree-based models still outperform deep learning on tabular data? | 2022-07-18 | Léo Grinsztajn et al | Link | ||
252 | APOLLO: An Optimized Training Approach for Long-form Numerical Reasoning | LREC 2024 | 2022-12-14 | Jiashuo Sun et al | Link | |
253 | An inner table retriever for robust table question answering | ACL 2023 | 2023-01-01 | Weizhe Lin et al | Link | |
254 | Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning | SIGIR 2023 | 2023-01-31 | Yunhu Ye et al | Link | |
255 | LoFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control | EACL 2023 | 2023-02-06 | Yilun Zhao et al | Link | |
256 | DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction | NeurIPS 2023 | 2023-04-21 | Mohammadreza Pourreza et al | Link | |
257 | Multi-View Graph Representation Learning for Answering Hybrid Numerical Reasoning Question | 2023-05-05 | Yifan Wei et al | Link | ||
258 | StructGPT: A General Framework for Large Language Model to Reason over Structured Data | 2023-05-16 | Jinhao Jiang et al | Link | ||
259 | Rethinking Tabular Data Understanding with Large Language Models | 2023-12-27 | Tianyang Liu et al | Link | ||
260 | CABINET: Content Relevance based Noise Reduction for Table Question Answering | ICLR 2024 | 2024-01-01 | Sohan Patnaik et al | Link | |
261 | Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding | ICLR 2024 | 2024-01-09 | Zilong Wang et al | Link | |
Security | 262 | Deep Learning with Differential Privacy | 2016-07-01 | Martín Abadi et al | Link | |
263 | Membership Inference Attacks against Machine Learning Models | 2016-10-18 | Reza Shokri et al | Link | ||
264 | Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks | IEEE Symposium on Security and Privacy (SP) 2019 | 2019-01-01 | Bolun Wang et al | Link | |
265 | STRIP: A Defence Against Trojan Attacks on Deep Neural Networks | ACSAC 2019 | 2019-02-18 | Yansong Gao et al | Link | |
266 | BadNets: Evaluating Backdooring Attacks on Deep Neural Networks | IEEE Access 2019 | 2019-04-11 | Tianyu Gu et al | Link | |
267 | Regula Sub-rosa: Latent Backdoor Attacks on Deep Neural Networks | ACM CCS 2019 | 2019-05-24 | Yuanshun Yao et al | Link | |
268 | Weight Poisoning Attacks on Pre-trained Models | ACL 2020 | 2020-04-14 | Keita Kurita et al | Link | |
269 | BadNL: Backdoor Attacks against NLP Models with Semantic-preserving Improvements | ACSAC 2021 | 2020-06-01 | Xiaoyi Chen et al | Link | |
270 | Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive Review | IEEE 2021 | 2020-07-21 | Yansong Gao et al | Link | |
271 | Trojaning Language Models for Fun and Profit | IEEE EuroS&P 2021 | 2020-08-01 | Xinyang Zhang et al | Link | |
272 | Scaling up Differentially Private Deep Learning with Fast Per-Example Gradient Clipping | PETS 2021 | 2020-09-07 | Jaewoo Lee et al | Link | |
273 | TextHide: Tackling Data Privacy in Language Understanding Tasks | EMNLP 2020 | 2020-10-02 | Yangsibo Huang et al | Link | |
274 | Extracting Training Data from Large Language Models | 2020-12-14 | Nicholas Carlini et al | Link | ||
275 | Hidden Backdoors in Human-Centric Language Models | CCS 2021 | 2021-01-01 | Shaofeng Li et al | Link | |
276 | T-Miner: A Generative Approach to Defend Against Trojan Attacks on DNN-based Text Classification | 2021-03-07 | Ahmadreza Azizi et al | Link | ||
277 | Large Language Models Can Be Strong Differentially Private Learners | 2021-10-12 | Xuechen Li et al | Link | ||
278 | Hidden Trigger Backdoor Attack on NLP Models via Linguistic Style Manipulation | USENIX Security 2022 | 2022-01-01 | Xudong Pan et al | Link | |
Computer Vision | 279 | Auto-Encoding Variational Bayes | 2013-12-20 | Diederik P Kingma et al | Link | |
280 | Show, Attend and Tell: Neural Image Caption Generation with Visual Attention | 2015-02-10 | Kelvin Xu et al | Link | ||
281 | InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets | 2016-06-12 | Xi Chen et al | Link | ||
282 | The Perception-Distortion Tradeoff | CVPR 2018 | 2017-11-16 | Yochai Blau et al | Link | |
283 | Do CIFAR-10 Classifiers Generalize to CIFAR-10? | 2018-06-01 | Benjamin Recht et al | Link | ||
284 | Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks | ICRA 2019 | 2018-10-24 | Michelle A. Lee et al | Link | |
285 | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | ICLR 2021 | 2020-10-22 | Alexey Dosovitskiy et al | Link | |
286 | Training data-efficient image transformers & distillation through attention | ICML 2021 | 2020-12-23 | Hugo Touvron et al | Link |
Google Scholar Crawler
Find recent papers that cite a given paper