
Split optimization docs into multiple pages
stuhlmueller committed Feb 19, 2017
1 parent c5f62d8 commit f7b5528
Showing 5 changed files with 207 additions and 208 deletions.
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -21,7 +21,7 @@ WebPPL Documentation
sample
distributions
inference/index
- optimization
+ optimization/index
functions/index
globalstore
packages
207 changes: 0 additions & 207 deletions docs/optimization.rst

This file was deleted.

34 changes: 34 additions & 0 deletions docs/optimization/index.rst
@@ -0,0 +1,34 @@
.. _optimization:

Optimization
============

Optimization provides an alternative approach to :ref:`marginal
inference <inference>`.

In this section we refer to the program for which we would like to
obtain the marginal distribution as the *target program*.

If we take a target program and add a :ref:`guide distribution
<guides>` to each random choice, then we can define the *guide
program* as the program obtained by sampling from the guide
distribution at each ``sample`` statement and ignoring all ``factor``
statements.

If we endow this guide program with adjustable parameters, then we
can optimize those parameters so as to minimize the distance between
the joint distribution over random choices in the guide program and
the joint distribution over random choices in the target program.
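
As a rough illustration (a minimal sketch only: the model below is
made up, and the exact syntax for attaching guides and declaring
parameters is described in the :ref:`guides <guides>` and
:ref:`parameters <parameters>` sections), a target program with a
guide distribution containing one adjustable parameter attached to
its random choice might look like this::

// Target program: a single Gaussian choice constrained by a factor.
// The guide proposes from a Gaussian whose mean is an adjustable
// parameter; optimization will tune this parameter.
var model = function() {
  var x = sample(Gaussian({mu: 0, sigma: 1}), {
    guide: function() {
      return Gaussian({mu: param(), sigma: 1});
    }
  });
  factor(-(x - 2) * (x - 2));
  return x;
};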

This general approach includes a number of well-known algorithms as
special cases.

It is supported in WebPPL by :ref:`a method for performing
optimization <optimize>`, primitives for specifying :ref:`parameters
<parameters>`, and the ability to specify guides.

.. toctree::
:maxdepth: 2

optimize
parameters
110 changes: 110 additions & 0 deletions docs/optimization/optimize.rst
@@ -0,0 +1,110 @@
Optimize
========

.. _optimize:

.. js:function:: Optimize(options)

:param object options: Optimization options.
:returns: Nothing.

Optimizes the parameters of the guide program specified by the
``model`` option.

The following options are supported:

.. describe:: model

A function of zero arguments that specifies the target and guide
programs.

This option must be present.

.. describe:: steps

The number of optimization steps to take.

Default: ``1``

.. describe:: optMethod

The optimization method used. The following methods are
available:

* ``'sgd'``
* ``'adagrad'``
* ``'rmsprop'``
* ``'adam'``

Each method takes a ``stepSize`` sub-option; see below for example
usage. Additional method-specific options are available; see the
`adnn optimization module`_ for details.

Default: ``'adam'``

.. describe:: estimator

Specifies the optimization objective and the method used to
estimate its gradients. See `Estimators`_.

Default: ``ELBO``

.. describe:: verbose

Default: ``true``


Example usage::

Optimize({model: model, steps: 100});
Optimize({model: model, optMethod: 'adagrad'});
Optimize({model: model, optMethod: {sgd: {stepSize: 0.5}}});
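
Options can also be combined; for instance (an illustrative
combination only, with made-up settings)::

Optimize({
  model: model,
  steps: 500,
  optMethod: {adam: {stepSize: 0.01}},
  estimator: {ELBO: {samples: 5}}
});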

Estimators
----------

The following estimators are available:

.. _elbo:

.. describe:: ELBO

This is the evidence lower bound (ELBO). Optimizing this objective
yields variational inference.

For best performance, use :js:func:`mapData` in place of
:js:func:`map` where possible when optimizing this objective. The
conditional independence information this provides is used to reduce
the variance of gradient estimates, which can significantly improve
performance, particularly in the presence of discrete random
choices. Data sub-sampling is also supported through the use of
:js:func:`mapData`.
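
As a sketch of the recommended pattern (the data set and observation
model here are hypothetical, and guides for the latent choice are
omitted for brevity; see :ref:`guides <guides>`)::

// Observing each data point inside mapData exposes the conditional
// independence between observations to the ELBO gradient estimator.
var model = function() {
  var mu = sample(Gaussian({mu: 0, sigma: 10}));
  mapData({data: data}, function(datum) {
    observe(Gaussian({mu: mu, sigma: 1}), datum);
  });
  return mu;
};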

The following options are supported:

.. describe:: samples

The number of samples to take for each gradient estimate.

Default: ``1``

.. describe:: avgBaselines

Enable the "average baseline removal" variance reduction
strategy.

Default: ``true``

.. describe:: avgBaselineDecay

The decay rate of the exponential moving average used to estimate
baselines.

Default: ``0.9``
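
Roughly (a sketch of a standard exponential-moving-average update,
not a verbatim description of the implementation)::

// newBaseline = decay * oldBaseline + (1 - decay) * latestEstimate
baseline = avgBaselineDecay * baseline + (1 - avgBaselineDecay) * latestEstimate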

Example usage::

Optimize({model: model, estimator: 'ELBO'});
Optimize({model: model, estimator: {ELBO: {samples: 10}}});

.. _adnn optimization module: https://github.com/dritchie/adnn/tree/master/opt