
Split optimization docs into multiple pages
stuhlmueller committed Feb 19, 2017
1 parent c5f62d8 commit f7b5528
Showing 5 changed files with 207 additions and 208 deletions.
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -21,7 +21,7 @@ WebPPL Documentation
sample
distributions
inference/index
- optimization
+ optimization/index
functions/index
globalstore
packages
207 changes: 0 additions & 207 deletions docs/optimization.rst

This file was deleted.

34 changes: 34 additions & 0 deletions docs/optimization/index.rst
@@ -0,0 +1,34 @@
.. _optimization:

Optimization
============

Optimization provides an alternative approach to :ref:`marginal
inference <inference>`.

In this section we refer to the program for which we would like to
obtain the marginal distribution as the *target program*.

If we take a target program and add a :ref:`guide distribution
<guides>` to each random choice, then we can define the *guide
program* as the program obtained by sampling from the guide
distribution at each ``sample`` statement and ignoring all ``factor``
statements.

If we endow this guide program with adjustable parameters, then we
can optimize those parameters so as to minimize the distance between
the joint distribution over random choices in the guide program and
the joint distribution over random choices in the target program.
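
As a rough illustration (a minimal sketch only: the model below is
made up, and the exact syntax for attaching guides and declaring
parameters is described in the :ref:`guides <guides>` and
:ref:`parameters <parameters>` sections), a target program with a
guide distribution containing one adjustable parameter attached to
its random choice might look like this::

// Target program: a single Gaussian choice constrained by a factor.
// The guide proposes from a Gaussian whose mean is an adjustable
// parameter; optimization will tune this parameter.
var model = function() {
  var x = sample(Gaussian({mu: 0, sigma: 1}), {
    guide: function() {
      return Gaussian({mu: param(), sigma: 1});
    }
  });
  factor(-(x - 2) * (x - 2));
  return x;
};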

This general approach includes a number of well-known algorithms as
special cases.

It is supported in WebPPL by :ref:`a method for performing
optimization <optimize>`, primitives for specifying :ref:`parameters
<parameters>`, and the ability to specify guides.

.. toctree::
:maxdepth: 2

optimize
parameters
110 changes: 110 additions & 0 deletions docs/optimization/optimize.rst
@@ -0,0 +1,110 @@
Optimize
========

.. _optimize:

.. js:function:: Optimize(options)

:param object options: Optimization options.
:returns: Nothing.

Optimizes the parameters of the guide program specified by the
``model`` option.

The following options are supported:

.. describe:: model

A function of zero arguments that specifies the target and guide
programs.

This option must be present.

.. describe:: steps

The number of optimization steps to take.

Default: ``1``

.. describe:: optMethod

The optimization method used. The following methods are
available:

* ``'sgd'``
* ``'adagrad'``
* ``'rmsprop'``
* ``'adam'``

Each method takes a ``stepSize`` sub-option; see below for example
usage. Additional method-specific options are available; see the
`adnn optimization module`_ for details.

Default: ``'adam'``

.. describe:: estimator

Specifies the optimization objective and the method used to
estimate its gradients. See `Estimators`_.

Default: ``ELBO``

.. describe:: verbose

Default: ``true``


Example usage::

Optimize({model: model, steps: 100});
Optimize({model: model, optMethod: 'adagrad'});
Optimize({model: model, optMethod: {sgd: {stepSize: 0.5}}});
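
Options can also be combined; for instance (an illustrative
combination only, with made-up settings)::

Optimize({
  model: model,
  steps: 500,
  optMethod: {adam: {stepSize: 0.01}},
  estimator: {ELBO: {samples: 5}}
});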

Estimators
----------

The following estimators are available:

.. _elbo:

.. describe:: ELBO

This is the evidence lower bound (ELBO). Optimizing this objective
yields variational inference.

For best performance, use :js:func:`mapData` in place of
:js:func:`map` where possible when optimizing this objective. The
conditional independence information this provides is used to reduce
the variance of gradient estimates, which can significantly improve
performance, particularly in the presence of discrete random
choices. Data sub-sampling is also supported through the use of
:js:func:`mapData`.
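
As a sketch of the recommended pattern (the data set and observation
model here are hypothetical, and guides for the latent choice are
omitted for brevity; see :ref:`guides <guides>`)::

// Observing each data point inside mapData exposes the conditional
// independence between observations to the ELBO gradient estimator.
var model = function() {
  var mu = sample(Gaussian({mu: 0, sigma: 10}));
  mapData({data: data}, function(datum) {
    observe(Gaussian({mu: mu, sigma: 1}), datum);
  });
  return mu;
};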

The following options are supported:

.. describe:: samples

The number of samples to take for each gradient estimate.

Default: ``1``

.. describe:: avgBaselines

Enable the "average baseline removal" variance reduction
strategy.

Default: ``true``

.. describe:: avgBaselineDecay

The decay rate of the exponential moving average used to estimate
baselines.

Default: ``0.9``
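
Roughly (a sketch of a standard exponential-moving-average update,
not a verbatim description of the implementation)::

// newBaseline = decay * oldBaseline + (1 - decay) * latestEstimate
baseline = avgBaselineDecay * baseline + (1 - avgBaselineDecay) * latestEstimate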

Example usage::

Optimize({model: model, estimator: 'ELBO'});
Optimize({model: model, estimator: {ELBO: {samples: 10}}});

.. _adnn optimization module: https://github.com/dritchie/adnn/tree/master/opt