
Confidence Calibration for Neural Networks

Implementation of calibration methods for neural networks. Each calibrator is provided as a Python function that operates directly on output logits. calibrate.py contains the calibrator implementations; metrics and visualization utilities for confidence outputs are in train.py.

Implementations of newly introduced calibrators will be added continuously.
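As a rough illustration of the logits-in, probabilities-out interface described above, here is a minimal temperature-scaling calibrator in PyTorch. The function name and default temperature are illustrative assumptions, not the actual signatures in calibrate.py.

    import torch.nn.functional as F

    def temperature_calibrator(logits, temperature=1.5):
        # Hypothetical sketch: divide logits by a (pre-fitted) temperature
        # and renormalize with softmax to get calibrated probabilities.
        return F.softmax(logits / temperature, dim=1)

    # Usage: probs = temperature_calibrator(model(inputs), temperature=t)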

Execution

Train a neural network from scratch and calibrate its confidence:

python main.py --dataset <dataset> --model_type <model> --optimizer <optim>

Load an already trained network and only calibrate its confidence:

python main.py --dataset <dataset> \
	       --model_type <model> \
	       --optimizer <optim> \
	       --load_model <path_to_models> 

Code for models such as resnet110 was copied and modified from akamaster's repository

Sample output

[Reliability diagram image]
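train.py is described above as holding the metrics and visualization code. For orientation, here is a minimal sketch of the standard Expected Calibration Error computation that underlies a reliability diagram; all names here are illustrative, not necessarily the repo's.

    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=15):
        # Standard ECE: bin predictions by confidence, then sum each bin's
        # |accuracy - mean confidence| gap, weighted by the fraction of
        # samples falling into the bin.
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ids = np.clip(np.digitize(confidences, edges) - 1, 0, n_bins - 1)
        ece = 0.0
        for b in range(n_bins):
            mask = ids == b
            if mask.any():
                gap = abs(correct[mask].mean() - confidences[mask].mean())
                ece += mask.mean() * gap
        return ece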

List of implemented methods

  1. Histogram Binning
  2. Matrix Scaling
  3. Vector Scaling
  4. Temperature Scaling (see the sketch after this list)
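Because the notes below discuss how sensitive temperature scaling is to the optimizer, here is a minimal sketch of fitting the temperature with LBFGS on held-out validation logits. The function name and defaults are our own assumptions, not necessarily how calibrate.py does it.

    import torch
    import torch.nn.functional as F

    def fit_temperature(val_logits, val_labels, lr=0.01, max_iter=50):
        # Hypothetical sketch: learn a single scalar T > 0 by minimizing
        # NLL on held-out validation logits. lr and max_iter are the
        # hyperparameters the notes below describe as sensitive.
        log_t = torch.zeros(1, requires_grad=True)  # optimize log T so T stays positive
        optimizer = torch.optim.LBFGS([log_t], lr=lr, max_iter=max_iter)

        def closure():
            optimizer.zero_grad()
            loss = F.cross_entropy(val_logits / log_t.exp(), val_labels)
            loss.backward()
            return loss

        optimizer.step(closure)
        return log_t.exp().item()

At test time the fitted T is applied as in the calibrator sketch above: softmax(test_logits / T).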

Notes

  • Some implementations available online use the same dataset for both training and validation. However, since the probability outputs of a neural network are heavily overfitted to its training data, calibrators fail to achieve reasonable performance in this setting.
  • The histogram binning method requires the user to choose an adequate number of bins, and this value should be tuned carefully in practice. Furthermore, when designing a histogram binning algorithm, one must decide how to split the bins (equally spaced or equally sized); our implementation uses the former (see the sketch after these notes).
  • The temperature scaling performance reported in the paper seems to be achieved when the LBFGS optimizer is used to fit the temperature value. In our experiments, results fluctuated considerably even under small changes to the optimizer's hyperparameters (learning rate, number of iterations).
  • The original paper reports an Expected Calibration Error near 0.25 for matrix scaling on CIFAR100. Such results also seem to arise when the LBFGS optimizer is used to train the calibrator; when SGD or Adam is used instead, the error drops to around 0.15.
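To make the equally-spaced-bins choice concrete, here is a minimal numpy sketch of histogram binning fitted on validation data. Function names and the default bin count are illustrative assumptions, not the repo's actual API.

    import numpy as np

    def fit_histogram_binning(val_confidences, val_correct, n_bins=15):
        # Equal-width bins over [0, 1] (the "equally spaced" choice above);
        # each bin's calibrated confidence is the empirical accuracy of the
        # validation samples that fall into it.
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ids = np.clip(np.digitize(val_confidences, edges) - 1, 0, n_bins - 1)
        bin_acc = np.zeros(n_bins)
        for b in range(n_bins):
            mask = ids == b
            if mask.any():
                bin_acc[b] = val_correct[mask].mean()
        return edges, bin_acc

    def apply_histogram_binning(confidences, edges, bin_acc):
        # Replace each raw confidence with the accuracy of its bin.
        ids = np.clip(np.digitize(confidences, edges) - 1, 0, len(bin_acc) - 1)
        return bin_acc[ids]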

Reference papers

  • Guo, C., Pleiss, G., Sun, Y., and Weinberger, K. Q. On Calibration of Modern Neural Networks. ICML 2017.
