create README.md (huggingface#8682)
* create README.md

* Apply suggestions from code review

Co-authored-by: Julien Chaumond <[email protected]>
bino282 and julien-c authored Nov 23, 2020
1 parent b5187e3 commit 52585e4
Showing 1 changed file with 38 additions and 0 deletions.
38 changes: 38 additions & 0 deletions model_cards/NlpHUST/vibert4news-base-cased/README.md
@@ -0,0 +1,38 @@
---
language: vi
---

# BERT for Vietnamese trained on more than 20 GB of news data

The model was applied to sentiment analysis using [AIViVN's comments dataset](https://www.aivivn.com/contests/6).

The model achieved 0.90268 on the public leaderboard (the winner's score was 0.90087).
Bert4news is also used in [ViNLP](https://github.com/bino282/ViNLP), a Vietnamese toolkit for segmentation and named entity recognition.

***************New Mar 11, 2020***************

**[BERT](https://github.com/google-research/bert)** (from Google) was released with the paper [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805).

We use word SentencePiece with basic BERT tokenization, and the same configuration as BERT base with `lowercase = False` (the model is cased).
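
To illustrate why the cased setting matters for Vietnamese, here is a minimal sketch of the basic (pre-subword) tokenization stage. It is a simplified re-implementation for illustration, not the code used to train this model: the names `BERT_BASE_CONFIG`, `is_punctuation`, and `basic_tokenize` are hypothetical, and the real BERT tokenizer also handles control characters and CJK text. The sketch assumes, as in the reference BERT implementation, that lowercasing is paired with accent stripping, which would erase Vietnamese tone marks.

```python
import unicodedata

# Standard BERT base hyperparameters. The vocabulary size is specific to
# this model and is not stated in the card, so it is left out here.
BERT_BASE_CONFIG = {
    "hidden_size": 768,
    "num_hidden_layers": 12,
    "num_attention_heads": 12,
    "intermediate_size": 3072,
    "max_position_embeddings": 512,
}


def is_punctuation(ch):
    """Return True for ASCII symbols and Unicode punctuation characters."""
    cp = ord(ch)
    if 33 <= cp <= 47 or 58 <= cp <= 64 or 91 <= cp <= 96 or 123 <= cp <= 126:
        return True
    return unicodedata.category(ch).startswith("P")


def basic_tokenize(text, lowercase=False):
    """Split on whitespace and punctuation.

    With lowercase=True the text is lowercased and combining marks are
    stripped; with lowercase=False (the setting used by this model) the
    case and diacritics are preserved.
    """
    tokens = []
    for word in text.split():
        if lowercase:
            word = word.lower()
            decomposed = unicodedata.normalize("NFD", word)
            word = "".join(c for c in decomposed if unicodedata.category(c) != "Mn")
        current = ""
        for ch in word:
            if is_punctuation(ch):
                # Flush the pending word, then emit the punctuation mark alone.
                if current:
                    tokens.append(current)
                    current = ""
                tokens.append(ch)
            else:
                current += ch
        if current:
            tokens.append(current)
    return tokens


print(basic_tokenize("Hà Nội, Việt Nam!"))
print(basic_tokenize("Hà Nội, Việt Nam!", lowercase=True))
```

The second call shows the information loss that `lowercase = False` avoids: once tone marks are removed, distinct Vietnamese words collapse into the same string.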

You can download the trained model:
- [TensorFlow](https://drive.google.com/file/d/1X-sRDYf7moS_h61J3L79NkMVGHP-P-k5/view?usp=sharing)
- [PyTorch](https://drive.google.com/file/d/11aFSTpYIurn-oI2XpAmcCTccB_AonMOu/view?usp=sharing)
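
Once the weights are available, they can probably be loaded with the `transformers` library. The sketch below is an assumption rather than a documented procedure: the Hub identifier `NlpHUST/vibert4news-base-cased` is inferred from this model card's path, the helper name `load_vibert4news` is hypothetical, and the layout of the Google Drive archives is not described here. If you use a downloaded archive instead, pass the path of a local directory that contains the config, vocabulary, and weight files.

```python
# Assumed Hub identifier, inferred from the model card path; not confirmed here.
MODEL_ID = "NlpHUST/vibert4news-base-cased"


def load_vibert4news(model_id=MODEL_ID):
    """Load the tokenizer and encoder from a Hub id or a local directory."""
    # Imported lazily so the module can be inspected without transformers installed.
    from transformers import BertModel, BertTokenizer

    # do_lower_case=False matches the cased setting described in this card.
    tokenizer = BertTokenizer.from_pretrained(model_id, do_lower_case=False)
    model = BertModel.from_pretrained(model_id)
    return tokenizer, model


def encode(text, tokenizer, model):
    """Return the last hidden states for one sentence."""
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model(**inputs)
    # Index 0 is the sequence of hidden states: (batch, tokens, hidden_size).
    return outputs[0]
```

Calling `tokenizer, model = load_vibert4news()` followed by `encode("Hà Nội là thủ đô của Việt Nam.", tokenizer, model)` should give one vector per token, provided the assumed identifier resolves.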



Run training with the base configuration:

``` bash

python train_pytorch.py \
--model_path=bert4news.pytorch \
--max_len=200 \
--batch_size=16 \
--epochs=6 \
--lr=2e-5

```
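
The `train_pytorch.py` script itself is not included in this card. As a rough illustration of the interface the command above implies, here is a hypothetical sketch of how such a script might declare its flags. The flag names come from the command, but the types, defaults, and help texts are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the flags used in the training command (hypothetical)."""
    parser = argparse.ArgumentParser(description="Fine-tuning entry point (sketch)")
    parser.add_argument("--model_path", type=str, required=True,
                        help="path to the pretrained PyTorch checkpoint")
    parser.add_argument("--max_len", type=int, default=200,
                        help="maximum sequence length in tokens")
    parser.add_argument("--batch_size", type=int, default=16,
                        help="number of examples per training batch")
    parser.add_argument("--epochs", type=int, default=6,
                        help="number of passes over the training set")
    parser.add_argument("--lr", type=float, default=2e-5,
                        help="optimizer learning rate")
    return parser


# Parse the same arguments as the shell command above.
args = build_parser().parse_args([
    "--model_path=bert4news.pytorch",
    "--max_len=200",
    "--batch_size=16",
    "--epochs=6",
    "--lr=2e-5",
])
print(vars(args))
```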

### Contact information
For personal communication related to this project, please contact Nha Nguyen Van ([email protected]).
