Test classic models vs BERT performance on the IMDB movie reviews + a sneak peek at BERT attention weights

BabaSanfour/Text-Classification--Classic-models-vs-BERT

This notebook compares four classic ML models from different families (logistic regression with L2 and with L1 regularization, random forest, and XGBoost) against a transformer: BERT (Bidirectional Encoder Representations from Transformers). The comparison uses accuracy and ROC AUC, two popular metrics in machine learning.

For the classic models, we used TF-IDF to turn each review into a feature vector and trained the models on those vectors.
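As an illustration of the classic pipeline, here is a minimal sketch using scikit-learn. The toy reviews, the default `TfidfVectorizer` settings, and the `liblinear` solver are assumptions made for the example, not the notebook's actual configuration; only logistic regression is shown, with both penalties, scored with the two metrics named above.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.pipeline import make_pipeline

# Toy stand-in for the IMDB data (hypothetical examples; 1 = positive, 0 = negative).
train_texts = [
    "a wonderful, moving film with great acting",
    "brilliant story and a great cast",
    "terrible plot and awful acting",
    "boring, awful and far too long",
]
train_labels = [1, 1, 0, 0]
test_texts = ["great film with a wonderful cast", "awful, boring plot"]
test_labels = [1, 0]

results = {}
for penalty in ("l2", "l1"):
    # TF-IDF turns raw text into sparse feature vectors; the classifier is fit on them.
    # liblinear is chosen here because it supports both L1 and L2 penalties.
    clf = make_pipeline(
        TfidfVectorizer(),
        LogisticRegression(penalty=penalty, solver="liblinear"),
    )
    clf.fit(train_texts, train_labels)

    pred = clf.predict(test_texts)                 # hard labels, used for accuracy
    proba = clf.predict_proba(test_texts)[:, 1]    # P(positive), used for ROC AUC
    results[penalty] = {
        "accuracy": accuracy_score(test_labels, pred),
        "roc_auc": roc_auc_score(test_labels, proba),
    }

for penalty, scores in results.items():
    print(penalty, scores)
```

Accuracy is computed from hard predictions, while ROC AUC needs a continuous score, which is why the sketch keeps both `predict` and `predict_proba`. Swapping in a random forest or XGBoost classifier would only change the second step of the pipeline.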

For BERT, we used the pretrained model provided by Hugging Face through the `BertForSequenceClassification` class.
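A minimal sketch of how such a classifier can be instantiated with the `transformers` library follows. The helper `build_classifier`, the checkpoint name `bert-base-uncased`, and the tiny random configuration are assumptions for illustration; the notebook's actual checkpoint and training loop are not shown here.

```python
import torch
from transformers import BertConfig, BertForSequenceClassification


def build_classifier(checkpoint=None, num_labels=2):
    """Return a BERT classifier with a `num_labels`-way head (hypothetical helper).

    checkpoint: a Hub name such as "bert-base-uncased" (assumed; downloads weights),
    or None for a tiny randomly initialised model that is only good for smoke tests.
    """
    if checkpoint is not None:
        return BertForSequenceClassification.from_pretrained(
            checkpoint, num_labels=num_labels
        )
    tiny = BertConfig(
        vocab_size=100,
        hidden_size=32,
        num_hidden_layers=2,
        num_attention_heads=2,
        intermediate_size=64,
        num_labels=num_labels,
    )
    return BertForSequenceClassification(tiny)


model = build_classifier()  # pass "bert-base-uncased" to load pretrained weights
model.eval()

# Three fake "reviews" of 16 token ids each; real inputs would come from a tokenizer.
input_ids = torch.randint(0, 100, (3, 16))
with torch.no_grad():
    logits = model(input_ids=input_ids).logits  # shape: (batch, num_labels)

probs = torch.softmax(logits, dim=-1)  # per-review class probabilities
print(probs)
```

The positive-class column of `probs` is the kind of continuous score that ROC AUC expects, and the argmax over `logits` gives the label used for accuracy.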

The task was binary classification of positive vs negative reviews on the well-known IMDB movie review dataset. The IMDB Reviews data can be downloaded from here: http://ai.stanford.edu/~amaas/data/sentiment/.

At the end, we also visualize the attention weights of some BERT heads using heatmaps.
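One way to obtain per-head attention weights and draw them as a heatmap is sketched below. It assumes the `transformers` and `matplotlib` libraries; the tiny random `BertConfig`, the `plot_head` helper, and the choice of the last layer's first head are illustrative assumptions, not the notebook's actual code.

```python
import torch
from transformers import BertConfig, BertModel

# Tiny random model so the sketch runs without downloading weights (assumption).
# "eager" attention is requested because fused attention kernels may not
# return the attention matrices.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
    attn_implementation="eager",
)
model = BertModel(config).eval()

input_ids = torch.randint(0, 100, (1, 8))  # one fake sequence of 8 token ids
with torch.no_grad():
    out = model(input_ids=input_ids, output_attentions=True)

# One tensor per layer, each of shape (batch, heads, seq_len, seq_len).
attentions = out.attentions
head = attentions[-1][0, 0]  # last layer, first head: a (seq_len, seq_len) matrix


def plot_head(weights, tokens, path):
    """Save one attention matrix as a heatmap (hypothetical helper)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    im = ax.imshow(weights.numpy(), cmap="viridis")
    ax.set_xticks(range(len(tokens)))
    ax.set_xticklabels(tokens, rotation=90)
    ax.set_yticks(range(len(tokens)))
    ax.set_yticklabels(tokens)
    ax.set_xlabel("attended-to token")
    ax.set_ylabel("query token")
    fig.colorbar(im, ax=ax)
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
```

Calling `plot_head(head, [str(i) for i in input_ids[0].tolist()], "attention.png")` writes the heatmap to disk. Each row of `head` is a softmax distribution over the tokens that a given query token attends to, so rows sum to 1.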
