Commit

Added multimodal datasets and SOTA models
gangeshwark committed Jun 23, 2018
1 parent fcfc90d commit e44ce7e
Showing 1 changed file with 32 additions and 0 deletions.
32 changes: 32 additions & 0 deletions README.md
@@ -97,6 +97,10 @@ datasets for other languages.
- [Sentihood](#sentihood)
- [SST](#sst)
- [Yelp](#yelp)
- [Multimodal Sentiment Analysis](#multimodal-sentiment-analysis)
- [MOSI](#mosi)
- [Multimodal Emotion Recognition](#multimodal-emotion-recognition)
- [IEMOCAP](#iemocap)
- [Semantic parsing](#semantic-parsing)
- [WikiSQL](#wikisql)
- [Semantic role labeling](#semantic-role-labeling)
@@ -725,6 +729,34 @@ Binary classification:
| CNN (Johnson and Zhang, 2016) | 2.90 | [Supervised and Semi-Supervised Text Categorization using LSTM for Region Embeddings](https://arxiv.org/abs/1602.02373) |
| Char-level CNN (Zhang et al., 2015) | 4.88 | [Character-level Convolutional Networks for Text Classification](https://papers.nips.cc/paper/5782-character-level-convolutional-networks-for-text-classification.pdf) |

## Multimodal Sentiment Analysis

### MOSI
The MOSI dataset ([Zadeh et al., 2016](https://arxiv.org/pdf/1606.06259.pdf)) is rich in sentimental expressions: 93 people review topics in English. The videos are segmented, and each segment's sentiment is scored between -3 (strong negative) and +3 (strong positive) by 5 annotators.

| Model | Accuracy | Paper / Source |
| ------------- | :-----:| --- |
| bc-LSTM (Poria et al., 2017) | 80.3% | [Context-Dependent Sentiment Analysis in User-Generated Videos](http://sentic.net/context-dependent-sentiment-analysis-in-user-generated-videos.pdf) |
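
To make the labeling scheme concrete, here is a minimal sketch of how per-annotator scores could be turned into a segment label and how a binary accuracy like the one in the table might be computed. It rests on assumptions not stated above: that a segment's label is the mean of its five annotator scores, that the Accuracy column is binary (positive vs. negative) accuracy, and that a score of exactly 0 counts as positive. The function names are hypothetical and not part of any MOSI tooling.

```python
from statistics import mean


def segment_score(annotator_scores):
    """Average the per-annotator scores (each in [-3, 3]) for one segment.

    Assumption: the segment label is the plain mean of the annotator scores.
    """
    if not annotator_scores:
        raise ValueError("need at least one annotator score")
    if any(not -3 <= s <= 3 for s in annotator_scores):
        raise ValueError("scores must lie in [-3, 3]")
    return mean(annotator_scores)


def to_binary(score):
    """Map a continuous score to 1 (positive) or 0 (negative).

    Assumption: a score of exactly 0 is treated as positive.
    """
    return 1 if score >= 0 else 0


def binary_accuracy(predicted_scores, gold_scores):
    """Fraction of segments whose predicted polarity matches the gold polarity."""
    if len(predicted_scores) != len(gold_scores):
        raise ValueError("predictions and gold labels must have the same length")
    if not gold_scores:
        raise ValueError("need at least one segment")
    hits = sum(
        to_binary(p) == to_binary(g)
        for p, g in zip(predicted_scores, gold_scores)
    )
    return hits / len(gold_scores)


# Made-up annotations for three segments, five annotators each.
annotations = [
    [3, 2, 3, 2, 3],
    [-1, -2, -1, 0, -1],
    [1, 0, 1, 1, 2],
]
gold = [segment_score(a) for a in annotations]

# Made-up model outputs on the same [-3, 3] scale.
predicted = [1.5, 0.4, 0.2]

print(binary_accuracy(predicted, gold))
```

In this toy example the second segment is predicted positive while its averaged label is negative, so two of the three segments match. The numbers are illustrative only and do not come from the dataset.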

## Multimodal Emotion Recognition

### IEMOCAP
The IEMOCAP dataset ([Busso et al., 2008](https://link.springer.com/article/10.1007/s10579-008-9076-6)) contains videos of 10 speakers acting in two-way conversations, segmented into utterances. All conversations are in English. The database contains the following categorical labels: anger, happiness, sadness, neutral, excitement, frustration, fear, surprise, and other.

**Monologue:**

| Model | Accuracy | Paper / Source |
| ------------- | :-----:| --- |
| bc-LSTM (Poria et al., 2017) | 76.10% | [Context-Dependent Sentiment Analysis in User-Generated Videos](http://sentic.net/context-dependent-sentiment-analysis-in-user-generated-videos.pdf) |

**Conversational:**

| Model | Score | Paper / Source |
| ------------- | :-----:| --- |
| CMN (Hazarika et al., 2018) | WAA = 77.62% | [Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos](http://aclweb.org/anthology/N18-1193) |



## Semantic parsing

Semantic parsing is the task of translating natural language into a formal meaning