
Computing the Neural Information Content

Jeremy Bernstein · Yisong Yue

About

This repository contains code to compute the information content of an infinitely wide neural network. What is this information content, exactly? Well, it's the negative log probability of the orthant in function space that is consistent with the training labels, where probability is calculated with respect to the Gaussian measure on function space that is induced by the neural architecture.

For a training set with binary class labels, there are three steps to compute it (a toy code sketch follows below):

  1. For all pairs of training inputs x_i and x_j, compute the correlation between network outputs under random sampling of the network weights: Sigma_ij := E_w[f(x_i, w) f(x_j, w)].
  2. For a random vector z ~ Normal(0, Sigma), estimate the probability p := Prob[sign(z) = c], where c is the binary vector of class labels.
  3. Return log(1/p).

And why is this interesting? Well, PAC-Bayes guarantees generalisation when the information content log(1/p) is smaller than the number of training data points. More details are given in our paper.
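To make the three steps concrete, here is a minimal Monte Carlo sketch on a toy problem. This is not the repository's implementation: the one-hidden-layer ReLU network, the sample counts, and the names mc_kernel and information_content are illustrative assumptions, and the naive hit-counting estimate of p is only viable when the training set is small.

import torch

def mc_kernel(X, width=1000, n_nets=200):
    # Step 1: estimate Sigma_ij = E_w[f(x_i, w) f(x_j, w)] by averaging over
    # randomly initialised one-hidden-layer ReLU networks (an assumed toy architecture).
    n, d = X.shape
    Sigma = torch.zeros(n, n)
    for _ in range(n_nets):
        W1 = torch.randn(d, width) / d ** 0.5      # input-to-hidden weights
        W2 = torch.randn(width, 1) / width ** 0.5  # hidden-to-output weights
        f = torch.relu(X @ W1) @ W2                # network outputs, shape (n, 1)
        Sigma += f @ f.T
    return Sigma / n_nets

def information_content(Sigma, c, n_samples=200_000):
    # Steps 2 and 3: draw z ~ Normal(0, Sigma), estimate p = Prob[sign(z) = c]
    # by counting hits, and return log(1/p).
    n = len(c)
    L = torch.linalg.cholesky(Sigma + 1e-6 * torch.eye(n))  # jitter for stability
    z = torch.randn(n_samples, n) @ L.T                     # each row is one sample of z
    p = (torch.sign(z) == c).all(dim=1).float().mean()
    return torch.log(1 / p)

X = torch.randn(8, 5)           # toy inputs: 8 training points in 5 dimensions
c = torch.sign(torch.randn(8))  # toy binary labels in {-1, +1}
print(information_content(mc_kernel(X), c).item())

Note that p shrinks exponentially with the number of training points, so hit-counting like this only works for tiny problems; see the paper for how the quantity is computed in practice.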

Getting started

  • Run the unit tests: python unit_test.py
  • Run the main script: python pac-bayes.py
  • Generate the plots using the Jupyter notebook make_plots.ipynb.

Environment details

The code was run with:

  • PyTorch 1.5.0
  • The Docker container nvcr.io/nvidia/pytorch:20.03-py3
  • An NVIDIA Titan RTX GPU with driver version 440.82 and CUDA version 10.2

For the exact version of the code used in arXiv:2103.01045, check out commit 49cc144.
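For example, from a local clone of the repository, the standard Git command is:

git checkout 49cc144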

Citation

If you find this code useful, feel free to cite the paper:

@inproceedings{entropix,
  title={Computing the Information Content of Trained Neural Networks},
  author={Jeremy Bernstein and Yisong Yue},
  booktitle={Workshop on the Theory of Overparameterized Machine Learning},
  year={2021}
}

License

We are making our algorithm available under a CC BY-NC-SA 4.0 license.
