Code for the paper: "Mitigating Bias in Calibration Error Estimation"
virtualenv -p python3 env3
source env3/bin/activate
pip install -r caltrain/requirements.txt
source env3/bin/activate
DATA_DIR='./caltrain/data' # This is the default value if omitted below
python -m caltrain.download_data --data_dir=${DATA_DIR}
Each script saves its plots to the directory given by the command-line option --plot_dir (default: ./caltrain/plots). To speed up computation, some values have been precomputed and cached; each plotting script reads these data from the directory given by --data_dir (default: ./caltrain/data). Generating Figure 3 also requires downloading the logit data from https://github.com/markus93/NN_calibration/tree/master/logits into the same data directory.
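One way to keep the two directories consistent across commands is to set them once in the shell and fall back to the documented defaults. The snippet below is a minimal sketch, not part of the repository: creating the directories up front and the emptiness check are assumptions made here for convenience, and the scripts themselves may not require either.

```shell
# Fall back to the documented defaults when the variables are unset.
DATA_DIR="${DATA_DIR:-./caltrain/data}"
PLOT_DIR="${PLOT_DIR:-./caltrain/plots}"

# Assumption of this sketch: create both directories if they are missing.
mkdir -p "${DATA_DIR}" "${PLOT_DIR}"

# Warn when the data directory is still empty, since the plotting
# scripts read cached values from it.
if [ -z "$(ls -A "${DATA_DIR}")" ]; then
  echo "warning: ${DATA_DIR} is empty; run caltrain.download_data first" >&2
fi
```

Exporting DATA_DIR or PLOT_DIR before running the snippet overrides the defaults.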
The following environment variable should be defined so that matplotlib uses the non-interactive Agg backend:
export MPLBACKEND=Agg
source env3/bin/activate
DATA_DIR='./caltrain/data' # This is the default value if omitted below
PLOT_DIR='./caltrain/plots' # This is the default value if omitted below
# Figure 1a left panel
python -m caltrain.plot_intro_reliability_diagram --plot_dir=${PLOT_DIR}
# Figure 1a right panel
python -m caltrain.plot_intro_ece_distribution --plot_dir=${PLOT_DIR}
# Figure 1b (both panels)
python -m caltrain.plot_tce_assumptions --plot_dir=${PLOT_DIR}
# Figure 2, Figure 7, Figure 8
python -m caltrain.plot_bias_heat_map --data_dir=${DATA_DIR} --plot_dir=${PLOT_DIR}
# Figure 3
python -m caltrain.plot_glm_beta_eece_sece --data_dir=${DATA_DIR} --plot_dir=${PLOT_DIR}
# Figure 4
python -m caltrain.plot_calibration_errors --data_dir=${DATA_DIR} --plot_dir=${PLOT_DIR}
# Figure 5
python -m caltrain.plot_ece_vs_tce --data_dir=${DATA_DIR} --plot_dir=${PLOT_DIR}
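
The per-figure commands above can also be run in a single pass. The following is a minimal sketch rather than a script shipped with the repository: the run and make_figures helpers and the DRY_RUN variable are hypothetical names introduced here, while the module names and flags are copied from the commands above. By default it only prints each command; setting DRY_RUN=0 executes them and stops at the first failure.

```shell
export MPLBACKEND=Agg
DATA_DIR="${DATA_DIR:-./caltrain/data}"
PLOT_DIR="${PLOT_DIR:-./caltrain/plots}"
DRY_RUN="${DRY_RUN:-1}"  # hypothetical switch: 1 prints commands, 0 runs them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "${DRY_RUN}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

make_figures() {
  # Figure 1 scripts take only --plot_dir.
  for module in plot_intro_reliability_diagram \
                plot_intro_ece_distribution \
                plot_tce_assumptions; do
    run python -m "caltrain.${module}" "--plot_dir=${PLOT_DIR}" || return 1
  done
  # Figures 2 to 5 (plus 7 and 8) also read cached data from --data_dir.
  for module in plot_bias_heat_map \
                plot_glm_beta_eece_sece \
                plot_calibration_errors \
                plot_ece_vs_tce; do
    run python -m "caltrain.${module}" \
      "--data_dir=${DATA_DIR}" "--plot_dir=${PLOT_DIR}" || return 1
  done
}

make_figures
```

Keeping the dry run as the default makes it easy to inspect the exact commands before committing to the full computation.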
@article{roelofs2020mitigating,
title={Mitigating bias in calibration error estimation},
author={Roelofs, Rebecca and Cain, Nicholas and Shlens, Jonathon and Mozer, Michael C},
journal={arXiv preprint arXiv:2012.08668},
year={2020}
}