Self-Normalizing Neural Networks (SNNs) were introduced by Klambauer et al. to enable high-level abstract representations in feed-forward neural networks (FNNs). These networks use a novel activation function called "scaled exponential linear units" (SELUs), which induces self-normalizing properties. Consequently, SNNs do not suffer from vanishing or exploding gradients, which allows them to train networks with many layers, employ strong regularization, and make learning highly robust.
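For reference, the sketch below shows what SELU computes (a scaled ELU with fixed constants from Klambauer et al.) and how it is typically used in TensorFlow 2, the framework used later in this project. The layer at the end is purely illustrative; its size is a placeholder.

```python
import numpy as np
import tensorflow as tf

# SELU constants from Klambauer et al. (fixed, not learned)
ALPHA = 1.6732632423543772
LAMBDA = 1.0507009873554805

def selu(x):
    # Scaled ELU: lambda * x for x > 0, lambda * alpha * (exp(x) - 1) otherwise.
    return LAMBDA * np.where(x > 0, x, ALPHA * (np.exp(x) - 1))

# TensorFlow 2 ships the same activation; the self-normalizing property
# additionally assumes lecun_normal weight initialization (and AlphaDropout
# instead of standard dropout).
layer = tf.keras.layers.Dense(128, activation="selu",
                              kernel_initializer="lecun_normal")
```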
In their publication, Eger et al. conduct a comparison of activation functions across several NLP tasks (and task types), using different neural network types: MLP, CNN, RNN, and LSTM networks.
Madasu et al. also study the effectiveness of SNNs for different text classification tasks. They test two CNN architectures: Self-Normalizing Convolutional Neural Networks (SCNN) with ELU as the activation function, and SCNN with SELU as the activation function.
The goal of this project is to test the SELU activation function in different architectures and apply it to different sentiment analysis tasks. If time allows, other activation functions such as Swish may also be tested.
Sentiment analysis is the task of classifying the polarity of a given text. Several datasets are available for this task to which state-of-the-art algorithms have been applied and against which we can compare our results:
Additionally:
- Network design: 25h
- Network training and tuning: 15h
- Implementation: 8h
- Report writing: 4h
We have researched, built and tested several architectures:
- Convolutional Neural Networks (a minimal sketch of such a model is shown after this list):
  - We tested simple and multi-branch convolutional neural networks.
  - We implemented and tested the Very Deep Convolutional Network model (VDCNN, EACL 2017).
  - We implemented and tested the Deep Pyramid Convolutional Network model (DPCNN, ACL 2017).
- Recurrent Neural Networks: we designed and tested LSTM networks, both simple and bidirectional.
- Hybrid Neural Networks: we designed different networks that combine LSTM and convolutional layers.
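As an illustration of the first group, the following is a minimal sketch of a single-branch CNN text classifier with SELU activations in TensorFlow 2. The vocabulary size, sequence length, and layer widths are placeholder values, not the exact hyperparameters used in our notebooks.

```python
import tensorflow as tf

VOCAB_SIZE = 20000   # placeholder vocabulary size
MAX_LEN = 400        # placeholder sequence length

# Minimal single-branch CNN for binary sentiment classification with SELU.
# lecun_normal initialization and AlphaDropout are the usual companions of
# SELU in self-normalizing setups.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, 128),
    tf.keras.layers.Conv1D(64, 5, activation="selu",
                           kernel_initializer="lecun_normal"),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.AlphaDropout(0.1),
    tf.keras.layers.Dense(64, activation="selu",
                          kernel_initializer="lecun_normal"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```

Swapping `"selu"` for `"relu"` (and dropping the SELU-specific initializer and AlphaDropout) gives the corresponding baseline model for comparison.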
All these models are built and implemented using TensorFlow 2 on Google Colab; the links to the corresponding files on Google Drive can be found in their respective Python notebooks. The notebooks show the results obtained in these preliminary experiments.
Models were evaluated on the IMDB reviews dataset for sentiment classification, using 5-fold cross-validation.
The loss function used is binary cross-entropy, although for the IMDB dataset the NLP Progress site only reports the accuracy obtained by each publication, so accuracy is the figure we compare against.
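The evaluation loop can be sketched as follows; `build_model()` is a hypothetical factory standing in for any of the models above (returning a model compiled with binary cross-entropy and an accuracy metric), and the training settings are placeholders rather than the exact values used in the notebooks.

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import KFold

# Load the IMDB reviews dataset (binary sentiment labels) and pad sequences.
(x_train, y_train), _ = tf.keras.datasets.imdb.load_data(num_words=20000)
x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=400)

accuracies = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True,
                                random_state=0).split(x_train):
    model = build_model()  # hypothetical factory returning a compiled model
    model.fit(x_train[train_idx], y_train[train_idx],
              epochs=3, batch_size=64, verbose=0)
    # Binary cross-entropy is the loss; accuracy is what we compare
    # against the results reported on NLP Progress.
    loss, acc = model.evaluate(x_train[val_idx], y_train[val_idx], verbose=0)
    accuracies.append(acc)

print("mean 5-fold validation accuracy:", np.mean(accuracies))
```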
Our goal is to achieve an accuracy of at least 91.8%, matching the lowest result reported on that page. Our highest accuracy so far is 90.4% (on the validation set), unexpectedly obtained by one of the simplest models, the simple CNN.
Preliminary results show that the SELU activation actually worsens model quality, and this negative effect grows with network depth (e.g., DPCNN or VDCNN). To extend these results, we plan to run tests on both the Amazon Reviews and Yelp Reviews datasets.