This is a reimplementation of "Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness" in PyTorch.
The code is based on the official repository (TensorFlow) and Hugging Face.
Original Paper : Link
Training environment : Ubuntu 18.04, python 3.6
pip3 install torch torchvision torchaudio
pip install scikit-learn
Download the bert-base-uncased checkpoint from huggingface-ckpt
Download the bert-base-uncased vocab file from huggingface-vocab
Download the CLINC OOS intent detection benchmark dataset from tensorflow-dataset
The downloaded files should be arranged in the following directory layout:
SNGP-BERT
    ㄴckpt
        ㄴbert-base-uncased-pytorch_model.bin
    ㄴdataset
        ㄴclinc_oos
            ㄴtrain.csv
            ㄴval.csv
            ㄴtest.csv
            ㄴtest_ood.csv
    ㄴvocab
        ㄴbert-base-uncased-vocab.txt
    ㄴmodels
    ...
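Before training, it can help to confirm that the downloads ended up in the right place. The following is a minimal sketch, not part of the repository: the helper name `missing_files` is hypothetical, and the nesting (`ckpt/`, `dataset/clinc_oos/`, `vocab/`) is an assumption read off the layout above.

```python
from pathlib import Path

# Files named in the directory layout above. The nesting is an assumption;
# adjust the relative paths if your checkout is organised differently.
EXPECTED_FILES = [
    "ckpt/bert-base-uncased-pytorch_model.bin",
    "dataset/clinc_oos/train.csv",
    "dataset/clinc_oos/val.csv",
    "dataset/clinc_oos/test.csv",
    "dataset/clinc_oos/test_ood.csv",
    "vocab/bert-base-uncased-vocab.txt",
]


def missing_files(root):
    """Return the expected relative paths that are not regular files under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Run from the SNGP-BERT directory.
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected files are present.")
```

Running the script from the project root lists any file that still needs to be downloaded or moved.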
In their paper, the authors ran the NLP out-of-domain (OOD) detection experiment on the CLINC OOS intent detection benchmark. The dataset covers 150 in-domain services with 150 training sentences per domain, plus 1500 natural out-of-domain utterances. You can download the dataset at Link.
Original dataset paper and GitHub: Paper Link, Git Link
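To make the OOD setup concrete: the SNGP paper scores each utterance with the Dempster-Shafer metric, u(x) = K / (K + sum_k exp(h_k(x))), where K is the number of classes and h_k are the logits. The sketch below illustrates that formula in plain Python. Whether this repository computes its score in exactly this way is an assumption, and the function name is hypothetical.

```python
import math


def dempster_shafer_uncertainty(logits):
    """Uncertainty u = K / (K + sum_k exp(logit_k)) for one example.

    The value lies in (0, 1): large logits (strong evidence for some class)
    push it towards 0, very negative logits push it towards 1.
    Note: math.exp overflows for very large logits; this is only a sketch.
    """
    num_classes = len(logits)
    evidence = sum(math.exp(z) for z in logits)
    return num_classes / (num_classes + evidence)


if __name__ == "__main__":
    confident = [8.0, 0.0, 0.0]   # strong evidence for class 0
    uncertain = [0.0, 0.0, 0.0]   # no evidence for any class
    print(dempster_shafer_uncertainty(confident))
    print(dempster_shafer_uncertainty(uncertain))
```

An utterance is then flagged as out-of-domain when its score is high relative to the scores of in-domain examples.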
python main.py --train_or_test train --method sngp --device gpu --gpu 0
python main.py --train_or_test test --method sngp --device gpu --gpu 0
Results for SNGP-BERT on CLINC OOS.
NOTE: Results may differ slightly depending on the random seed.
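To reduce run-to-run variation, the random number generators can be seeded before training. This is a generic sketch rather than the repository's own code: `set_seed` is a hypothetical helper, and the NumPy and PyTorch calls are skipped if those packages are not installed.

```python
import random


def set_seed(seed):
    """Seed Python, and NumPy / PyTorch when they are available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        # Safe to call without a GPU; it is ignored when CUDA is unavailable.
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


if __name__ == "__main__":
    set_seed(42)
    first = random.random()
    set_seed(42)
    second = random.random()
    print(first == second)
```

Even with a fixed seed, some GPU operations are non-deterministic, so small differences between runs can remain.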
Version | ACC (%) | AUROC | AUPRC |
---|---|---|---|
Paper (TensorFlow) | 96.6 | 0.969 | 0.880 |
PyTorch (batch size = 256) | 96.1 | 0.974 | 0.900 |
PyTorch (batch size = 64) | 95.9 | 0.972 | 0.894 |
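
For readers unfamiliar with the OOD metrics in the table, the sketch below shows how AUROC and AUPRC (computed here as average precision) can be derived from per-example uncertainty scores. It is a plain-Python illustration, not the evaluation code of this repository. It assumes that OOD examples are the positive class (label 1) and that a higher score means more uncertain. scikit-learn's `roc_auc_score` and `average_precision_score` provide the same metrics for real workloads.

```python
def auroc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Sum of precision at each distinct threshold, weighted by the recall gained there."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    tp = fp = 0
    ap = 0.0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        j = i
        # Consume every example that shares the current score (one threshold).
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        recall = tp / n_pos
        precision = tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


if __name__ == "__main__":
    # Toy data: label 1 = out-of-domain, label 0 = in-domain.
    scores = [0.9, 0.8, 0.3, 0.1]
    labels = [1, 0, 1, 0]
    print(auroc(scores, labels))
    print(average_precision(scores, labels))
```

The pairwise AUROC loop is quadratic in the number of examples, which is fine for illustration but slow for a full test set.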
[1] https://github.com/google/uncertainty-baselines/blob/main/baselines/clinc_intent/sngp.py
[2] https://huggingface.co/
[3] https://github.com/google/edward2