update to 2.4.0

RevanthGundala · Nov 3, 2021 · 69cdc43
1 parent 854b6c1
commit 69cdc43

Showing 124 changed files with 13,269 additions and 149 deletions.
diff --git a/README.md b/README.md
@@ -6,7 +6,8 @@
 - Efficient *Markov Chain Monte Carlo (MCMC)*,
 - Black-box inference, no hand-tuning,
 - Excellent performance in terms of autocorrelation time and convergence rate,
-- Scale to multiple CPUs without any extra effort.
+- Scale to multiple CPUs without any extra effort,
+- Automated Convergence diagnostics.
 
 [![GitHub](https://img.shields.io/badge/GitHub-minaskar%2Fzeus-blue)](https://github.com/minaskar/zeus)
 [![arXiv](https://img.shields.io/badge/arXiv-2002.06212-red)](https://arxiv.org/abs/2002.06212)

diff --git a/docs/_build/doctrees/api.doctree b/docs/_build/doctrees/api.doctree
diff --git a/docs/_build/doctrees/api/callbacks.doctree b/docs/_build/doctrees/api/callbacks.doctree
diff --git a/docs/_build/doctrees/api/sampler.doctree b/docs/_build/doctrees/api/sampler.doctree
diff --git a/docs/_build/doctrees/cookbook.doctree b/docs/_build/doctrees/cookbook.doctree
diff --git a/docs/_build/doctrees/environment.pickle b/docs/_build/doctrees/environment.pickle
diff --git a/docs/_build/doctrees/faq.doctree b/docs/_build/doctrees/faq.doctree
diff --git a/docs/_build/doctrees/index.doctree b/docs/_build/doctrees/index.doctree
diff --git a/docs/_build/doctrees/nbsphinx/notebooks/convergence.ipynb b/docs/_build/doctrees/nbsphinx/notebooks/convergence.ipynb
diff --git a/docs/_build/doctrees/nbsphinx/notebooks/progress.ipynb b/docs/_build/doctrees/nbsphinx/notebooks/progress.ipynb
@@ -0,0 +1,124 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Incrementally saving progress to a file"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In many cases it is useful to save the chain to a file. This makes it easier to post-process a long chain and makes things less disastrous if the computer crashes somewhere in the middle of an expensive MCMC run.\n",
+    "\n",
+    "In this recipe we are going to use the callback interface to save the samples and their corresponding log-probability values in a `.h5` file. To do this you need to have [``h5py``](https://docs.h5py.org/en/latest/build.html#pre-built-installation-recommended) installed.\n",
+    "\n",
+    "We will set up a simple problem of sampling from a normal/Gaussian distribution as an example:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import zeus\n",
+    "import numpy as np\n",
+    "\n",
+    "ndim = 2\n",
+    "nwalkers = 10\n",
+    "nsteps = 1000\n",
+    "\n",
+    "def log_prob(x):\n",
+    "    return -0.5*np.dot(x,x)\n",
+    "\n",
+    "x0 = 1e-3 * np.random.randn(nwalkers, ndim)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Here ``x0`` holds the initial positions of the walkers.\n",
+    "\n",
+    "We will then initialise the sampler and start the MCMC run by providing the ``zeus.callbacks.SaveProgressCallback`` callback function."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "Initialising ensemble of 10 walkers...\n",
+      "Sampling progress : 100%|██████████| 1000/1000 [00:01<00:00, 656.62it/s]\n"
+     ]
+    }
+   ],
+   "source": [
+    "sampler = zeus.EnsembleSampler(nwalkers, ndim, log_prob)\n",
+    "sampler.run_mcmc(x0, nsteps, callbacks=zeus.callbacks.SaveProgressCallback(\"saved_chains.h5\", ncheck=100))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The above piece of code saved the chain incrementally every ``ncheck=100`` steps to a file named ``saved_chains.h5``. We can now access the chains using the ``h5py`` package as follows:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "(1000, 10, 2)\n",
+      "(1000, 10)\n"
+     ]
+    }
+   ],
+   "source": [
+    "import h5py \n",
+    "\n",
+    "with h5py.File('saved_chains.h5', \"r\") as hf:\n",
+    "    samples = np.copy(hf['samples'])\n",
+    "    logprob_samples = np.copy(hf['logprob'])\n",
+    "\n",
+    "print(samples.shape)\n",
+    "print(logprob_samples.shape)"
+   ]
+  }
+ ],
+ "metadata": {
+  "interpreter": {
+   "hash": "42ef9c41c9809f9bfe38b73fa705c16bbb3d6fadc6a1917ff578a20446617baf"
+  },
+  "kernelspec": {
+   "display_name": "Python 3.7.10 64-bit ('nbodykit-env': conda)",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.10"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/docs/_build/doctrees/nbsphinx/notebooks_convergence_12_0.png b/docs/_build/doctrees/nbsphinx/notebooks_convergence_12_0.png
diff --git a/docs/_build/doctrees/nbsphinx/notebooks_convergence_12_1.png b/docs/_build/doctrees/nbsphinx/notebooks_convergence_12_1.png
diff --git a/docs/_build/doctrees/nbsphinx/notebooks_convergence_13_0.png b/docs/_build/doctrees/nbsphinx/notebooks_convergence_13_0.png
diff --git a/docs/_build/doctrees/nbsphinx/notebooks_convergence_14_0.png b/docs/_build/doctrees/nbsphinx/notebooks_convergence_14_0.png
diff --git a/docs/_build/doctrees/nbsphinx/notebooks_convergence_16_0.png b/docs/_build/doctrees/nbsphinx/notebooks_convergence_16_0.png
diff --git a/docs/_build/doctrees/notebooks/convergence.doctree b/docs/_build/doctrees/notebooks/convergence.doctree
diff --git a/docs/_build/doctrees/notebooks/progress.doctree b/docs/_build/doctrees/notebooks/progress.doctree
diff --git a/docs/_build/html/.buildinfo b/docs/_build/html/.buildinfo
@@ -1,4 +1,4 @@
 # Sphinx build info version 1
 # This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 0755be5405a74e16d41e5187e65c4512
+config: 51cd34c90186d950f94f7df346d54026
 tags: 645f666f9bcd5a90fca523b33c5a78b7
diff --git a/docs/_build/html/_images/notebooks_convergence_12_0.png b/docs/_build/html/_images/notebooks_convergence_12_0.png
diff --git a/docs/_build/html/_images/notebooks_convergence_12_1.png b/docs/_build/html/_images/notebooks_convergence_12_1.png
diff --git a/docs/_build/html/_images/notebooks_convergence_13_0.png b/docs/_build/html/_images/notebooks_convergence_13_0.png
diff --git a/docs/_build/html/_images/notebooks_convergence_14_0.png b/docs/_build/html/_images/notebooks_convergence_14_0.png
diff --git a/docs/_build/html/_images/notebooks_convergence_16_0.png b/docs/_build/html/_images/notebooks_convergence_16_0.png
diff --git a/docs/_build/html/_sources/api.rst.txt b/docs/_build/html/_sources/api.rst.txt
@@ -2,12 +2,13 @@
 API Reference
 =============
 
-**zeus** consists mainly of five parts:
+**zeus** consists mainly of six parts:
 
 .. toctree::
    :maxdepth: 2
 
    api/sampler
+   api/callbacks
    api/moves
    api/autocorr
    api/parallel

diff --git a/docs/_build/html/_sources/api/callbacks.rst.txt b/docs/_build/html/_sources/api/callbacks.rst.txt
@@ -0,0 +1,36 @@
+=============
+The Callbacks
+=============
+
+Starting from version 2.4.0, ``zeus`` supports callback functions. These are functions that are
+called at every iteration of a run. Among other things, they can be used to monitor useful quantities,
+assess convergence, and save the chains to disk. Custom callback functions can also be used. Sampling
+terminates if a callback function returns ``True`` and continues while ``False`` or ``None`` is
+returned.
+
+Autocorrelation Callback
+========================
+
+.. autoclass:: zeus.callbacks.AutocorrelationCallback
+    :members:
+
+
+Split-R Callback
+================
+
+.. autoclass:: zeus.callbacks.SplitRCallback
+    :members:
+
+
+Minimum Iterations Callback
+===========================
+
+.. autoclass:: zeus.callbacks.MinIterCallback
+    :members:
+
+
+Save Progress Callback
+======================
+
+.. autoclass:: zeus.callbacks.SaveProgressCallback
+    :members:
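
Note on the stopping rule described in `callbacks.rst` above (a callback returning ``True`` ends the run, ``False`` or ``None`` lets it continue): the following is a minimal, self-contained sketch of that protocol, not zeus's actual implementation. The driver `run_with_callbacks`, the class `MaxIterCallback`, the function `log_every`, and the `(iteration, samples, logprob)` call signature are all hypothetical names chosen for illustration; the real signature should be taken from the ``zeus.callbacks`` API reference.

```python
import random


class MaxIterCallback:
    """Hypothetical callback: ask the run to stop after `nmax` iterations."""

    def __init__(self, nmax):
        self.nmax = nmax

    def __call__(self, iteration, samples, logprob):
        # Returning True requests termination; False lets the run continue.
        return iteration >= self.nmax


def log_every(iteration, samples, logprob):
    """Hypothetical monitoring callback; returns None, so it never stops the run."""
    if iteration % 100 == 0:
        print(f"iteration {iteration}: last log-prob = {logprob[-1]:.3f}")


def run_with_callbacks(nsteps, callbacks, seed=0):
    """Toy stand-in for a sampler loop that consults its callbacks every iteration."""
    rng = random.Random(seed)
    samples, logprob = [], []
    for iteration in range(1, nsteps + 1):
        x = rng.gauss(0.0, 1.0)  # placeholder for one real MCMC update
        samples.append(x)
        logprob.append(-0.5 * x * x)
        # Call every callback (no short-circuit) so that monitors always run.
        results = [cb(iteration, samples, logprob) for cb in callbacks]
        if any(results):  # None and False both count as "keep going"
            break
    return samples


chain = run_with_callbacks(1000, [log_every, MaxIterCallback(250)])
print(len(chain))
```

The sketch treats ``None`` and ``False`` identically, so a callback written only for monitoring or saving does not need an explicit return value.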
diff --git a/docs/_build/html/_sources/cookbook.rst.txt b/docs/_build/html/_sources/cookbook.rst.txt
@@ -37,27 +37,33 @@ Parallelisation recipes
 
 .. _MPI and ChainManager: notebooks/MPI.ipynb
 
+.. raw:: html
 
-Saving Progress recipes
-=======================
+    <style>
+        .red {color:red; font-weight:bold;}
+        .b {color:#0000FF; background-color:white;}
+    </style>
 
-- `Tracking metadata using the blobs interface`_
-    We introduce the blobs interface. An easy way for the user to track arbitrary metadata for every sample of the chain.
+.. role:: red
 
-.. _Tracking metadata using the blobs interface: notebooks/blobs.ipynb
+Convergence Diagnostics and Saving Progress recipes :red:`NEW`
+==============================================================
 
-- Save progress using h5py. (soon)
-    Save chains into a file.
+- `Automated Convergence Diagnostics using the callback interface`_ :red:`NEW`
+    In this recipe we are going to use the callback interface to monitor convergence and stop sampling automatically.
 
+- `Saving progress to disk using h5py`_ :red:`NEW`
+    In this recipe we are going to use the callback interface to save the samples and their corresponding log-probability values in a ``.h5`` file.
 
+- `Tracking metadata using the blobs interface`_
+    We introduce the blobs interface, an easy way for the user to track arbitrary metadata for every sample of the chain.
+
+.. _Automated Convergence Diagnostics using the callback interface: notebooks/convergence.ipynb
 
-Autocorrelation Analysis recipes
-================================
+.. _Saving progress to disk using h5py: notebooks/progress.ipynb
+
+.. _Tracking metadata using the blobs interface: notebooks/blobs.ipynb
 
-- Measure the autocorrelation time and effective sample size of a chain (soon)
-    This recipe demonstrates how to compute the autocorrelation time of a chain (i.e. a measure of
-    the statistical independence of the samples). Having this we can also calculate the effective sample
-    size of the chain.
 
 
 .. toctree::
@@ -69,4 +75,6 @@ Autocorrelation Analysis recipes
     notebooks/multimodal.ipynb
     notebooks/multiprocessing.ipynb
     notebooks/MPI.ipynb
-    notebooks/blobs.ipynb
+    notebooks/blobs.ipynb
+    notebooks/progress.ipynb
+    notebooks/convergence.ipynb
diff --git a/docs/_build/html/_sources/faq.rst.txt b/docs/_build/html/_sources/faq.rst.txt
@@ -6,7 +6,7 @@ What is the acceptance rate of ``zeus``?
 ========================================
 
 Unlike most MCMC methods, ``zeus`` acceptance rate isn't varying during a run. As a matter of fact,
-its acceptance rate is identically 1 always. This is because of the Slice Sampler at its core.
+its acceptance rate is identically 1, always. This is because of the Slice Sampler at its core.
 
 
 Why should I use zeus instead of other MCMC samplers?
@@ -23,15 +23,15 @@ dimensionality and handle challenging distributions better.
 What are the walkers?
 =====================
 
-Walkers are the members of the ensemble and the explore the parameter space in parallel. Collectively,
-the converge to the target distribution.
+Walkers are the members of the ensemble. They are interacting parallel chains which collectively explore 
+the posterior mass.
 
 
 How many walkers should I use?
 ==============================
 
-At least twice the number of parameters of your problem. A good rule of thump is to use approximately
-2.5 times the number of parameters. If your distribution has multiple modes you may want to increase
+At least twice the number of parameters of your problem. A good rule of thumb is to use between 2 and 4
+times the number of parameters. If your distribution has multiple modes/peaks you may want to increase
 the number of walkers.
 
 
@@ -40,14 +40,34 @@ How should I initialize the positions of the walkers?
 
 A good practice seems to be to initialize the walkers from a small ball close to the *Maximum a Posteriori*
 estimate. After a few autocorrelation times the walkers would have explored the rest of the usefull regions
-of the parameter space (a.k.a. the typical set), producing a great number of independent samples.
+of the parameter space (i.e. the typical set), producing a great number of independent samples.
 
 
 How long should I run ``zeus``?
 ===============================
 
 You don't have to run ``zeus`` for very long. If your goal is to produce 2D/1D contours and/or 1-sigma/2-sigma
 constraints for your parameters, running ``zeus`` for a few autocorrelation times (e.g. 10) is more than enough.
+You can also use the implemented callback functions (see Cookbook and API) to automate the termination of a run.
+
+
+What can I do if the first few iterations take too long to complete?
+====================================================================
+
+This usually occurs when the walkers are initialised close to each other. During the first ``10-100`` iterations
+``zeus`` is tuning its proposal scale ``mu``. During that time ``zeus`` may do more model evaluations than usual.
+Tuning of ``mu`` is faster if initialised from a large value. We thus recommend setting ``mu`` to a large value
+(e.g. ``mu=1e3``) initially in the ``EnsembleSampler``.
+
+
+Is there any way to reduce the computational cost per iteration?
+================================================================
+
+``zeus``'s power originates in its flexibility. During each iteration, the walkers move along straight lines (i.e. slices)
+that cross the posterior mass. The construction of a slice involves two steps: an initial expanding/stepping-out procedure and a subsequent
+shrinking procedure. One can decrease the computational cost per iteration by forcing ``zeus`` to conduct no expansions. This is
+achieved by setting ``light_mode=True`` in the ``EnsembleSampler``, at the cost of reduced flexibility. If the target distribution
+is close to a normal/Gaussian distribution, this can cut the cost in half.
 
 
 What are the ``Moves`` and which one should I use?
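
Note on the FAQ entry about reducing the computational cost per iteration, added in the hunk above: the two steps it names (stepping-out, then shrinking) can be illustrated with the textbook univariate slice sampler. This is only a sketch under that assumption and is not zeus's ensemble implementation; `slice_sample_1d`, `count_calls`, `width`, and `expand` are hypothetical names, and `expand=False` merely mimics the idea of skipping expansions.

```python
import math
import random


def slice_sample_1d(log_prob, x0, width=1.0, expand=True, rng=random):
    """One univariate slice-sampling update starting from x0."""
    # Slice level: log(u) + log_prob(x0) with u drawn from (0, 1].
    log_y = log_prob(x0) + math.log(1.0 - rng.random())
    # Place an interval of size `width` at a random offset around x0.
    left = x0 - width * rng.random()
    right = left + width
    if expand:
        # Stepping-out: grow each end until it falls outside the slice.
        while log_prob(left) >= log_y:
            left -= width
        while log_prob(right) >= log_y:
            right += width
    # Shrinking: propose inside the interval, shrink towards x0 on rejection.
    while True:
        x1 = left + (right - left) * rng.random()
        if log_prob(x1) >= log_y:
            return x1
        if x1 < x0:
            left = x1
        else:
            right = x1


def count_calls(fn):
    """Wrap fn so the number of evaluations can be read from `.calls`."""
    def wrapped(x):
        wrapped.calls += 1
        return fn(x)
    wrapped.calls = 0
    return wrapped


def log_prob(x):
    return -0.5 * x * x  # standard normal, up to an additive constant


stats = {}
for expand in (True, False):
    counted = count_calls(log_prob)
    rng = random.Random(1)
    x, chain = 0.0, []
    for _ in range(20000):
        x = slice_sample_1d(counted, x, width=2.0, expand=expand, rng=rng)
        chain.append(x)
    mean = sum(chain) / len(chain)
    var = sum((v - mean) ** 2 for v in chain) / len(chain)
    stats[expand] = (counted.calls / len(chain), mean, var)
    print(f"expand={expand}: {stats[expand][0]:.2f} evaluations per update, "
          f"mean={mean:.3f}, var={var:.3f}")
```

Both variants target the same distribution; skipping the expansion saves the evaluations spent locating the slice edges, but each move is then confined to an interval of fixed width around the current point, which is where the loss of flexibility comes from.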

diff --git a/docs/_build/html/_sources/index.rst.txt b/docs/_build/html/_sources/index.rst.txt
@@ -4,13 +4,23 @@
     :scale: 30 %
     :align: center
 
+.. raw:: html
+
+    <style>
+        .red {color:red; font-weight:bold;}
+        .b {color:#0000FF; background-color:white;}
+    </style>
+
+.. role:: red
+
 **zeus is a Python implementation of the Ensemble Slice Sampling method.**
 
 - Fast & Robust *Bayesian Inference*,
 - Efficient *Markov Chain Monte Carlo (MCMC)*,
 - Black-box inference, no hand-tuning,
 - Excellent performance in terms of autocorrelation time and convergence rate,
-- Scale to multiple CPUs without any extra effort.
+- Scale to multiple CPUs without any extra effort,
+- Automated Convergence diagnostics. :red:`NEW`
 
 
 .. image:: https://img.shields.io/badge/GitHub-minaskar%2Fzeus-blue
@@ -106,6 +116,12 @@ Copyright 2019-2021 Minas Karamanis and contributors.
 Changelog
 =========
 
+**2.4.0 (xx/11/21)**
+
+- Introduced callback interface.
+- Added convergence diagnostics.
+- Added ``HDF5`` support.
+
 **2.3.1 (03/08/21)**
 
 - Raise exception if model fails.