nlp-benchmark

Datasets:

Dataset	Classes	Train samples	Test samples	Source
Imdb	2	25 000	25 000	link
AG’s News	4	120 000	7 600	link
Sogou News	5	450 000	60 000	link
DBPedia	14	560 000	70 000	link
Yelp Review Polarity	2	560 000	38 000	link
Yelp Review Full	5	650 000	50 000	link
Yahoo! Answers	10	1 400 000	60 000	link
Amazon Review Full	5	3 000 000	650 000	link
Amazon Review Polarity	2	3 600 000	400 000	link

Models:

[1]: CNN: Character-level convolutional networks for text classification (paper)
[2]: VDCNN: Very deep convolutional networks for text classification (paper)
[3]: HAN: Hierarchical Attention Networks for Document Classification (paper); all credit goes to @cedias
[4]: Transformer Encoder: Attention Is All You Need, encoder part (paper); credit to Yu-Hsiang Huang's work

[Figure: HAN word (red) and sentence (blue) attention weights at prediction time]

Experiments:

Each table reports two test-set accuracies (in %) per model:

(i) paper accuracy: test-set accuracy reported in the original paper
(ii) repo accuracy: test-set accuracy reproduced in this repository
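
As a minimal sketch of the metric behind both columns: test-set accuracy is the percentage of test samples whose predicted label matches the true label. The function name `accuracy_percent` and the toy labels below are illustrative assumptions, not code taken from this repository.

```python
from typing import Sequence


def accuracy_percent(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the percentage of positions where the prediction equals the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Toy example with made-up labels (hypothetical data, 4 classes).
labels = [0, 1, 2, 3, 0, 1, 2, 3]
preds = [0, 1, 2, 3, 0, 1, 3, 2]
print(accuracy_percent(labels, preds))  # 6 of 8 predictions are correct
```

With 6 of 8 predictions correct, the call returns 75.0, which is the scale used in the tables that follow.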

Imdb

Model	Paper accuracy (%)	Repo accuracy (%)
CNN small
VDCNN 9 layers
VDCNN 17 layers
VDCNN 29 layers
HAN		90.5
Transformer		88.6

Ag news

Model	Paper accuracy (%)	Repo accuracy (%)
CNN small	84.35	88.30
VDCNN 9 layers	90.17	89.22
VDCNN 17 layers	90.61	90.00
VDCNN 29 layers	91.27	90.43
HAN		92.4
Transformer		93.2
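
To read the table above as a reproduction check, one can subtract the paper accuracy from the repo accuracy for every model that has both values. The sketch below copies the four complete rows from the table; the helper name `reproduction_gaps` is a hypothetical choice for illustration, and the HAN and Transformer rows are left out because they have no paper accuracy listed.

```python
# (paper accuracy, repo accuracy) in percent, copied from the table above.
ag_news = {
    "CNN small": (84.35, 88.30),
    "VDCNN 9 layers": (90.17, 89.22),
    "VDCNN 17 layers": (90.61, 90.00),
    "VDCNN 29 layers": (91.27, 90.43),
}


def reproduction_gaps(results):
    """Map each model to (repo - paper) in percentage points, rounded to 2 decimals."""
    return {
        model: round(repo - paper, 2)
        for model, (paper, repo) in results.items()
    }


gaps = reproduction_gaps(ag_news)
for model, gap in gaps.items():
    # A positive gap means the reproduction scored higher than the paper.
    print(f"{model}: {gap:+.2f}")
```

A positive gap means the reproduced model beats the number reported in the paper; a negative gap means it falls short.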

Sogou news

Model	Paper accuracy (%)	Repo accuracy (%)
CNN small	91.35	93.53
VDCNN 9 layers	96.42	93.50
VDCNN 17 layers	96.49
VDCNN 29 layers	96.64	87.90
HAN		96.
Transformer		95.6

DBpedia

Model	Paper accuracy (%)	Repo accuracy (%)
CNN small	98.02	98.15
VDCNN 9 layers	98.75	98.35
VDCNN 17 layers	98.02	98.15
VDCNN 29 layers	98.71
HAN		99.0
Transformer		98.7

Yelp polarity

Model	Paper accuracy (%)	Repo accuracy (%)
CNN small
VDCNN 9 layers	94.73	93.97
VDCNN 17 layers	94.95	94.73
VDCNN 29 layers	95.72	94.75
HAN

Yelp review

Model	Paper accuracy (%)	Repo accuracy (%)
CNN small
VDCNN 9 layers	61.96	61.18
VDCNN 17 layers	62.59
VDCNN 29 layers	64.26	62.73
HAN		63.

ArdalanM/nlp-benchmarks

Folders and files

Latest commit

History

Repository files navigation

nlp-benchmark

Datasets:

Models:

Experiments:

Imdb

Ag news

Sogu news

DBpedia

Yelp polarity

Yelp review

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages