[Doc] Re-organize API docs and tutorials (dmlc#1222)
* reorg tutorials and api docs

* fix
jermainewang authored Jan 26, 2020
1 parent f1a8f92 commit 5967d81
Showing 28 changed files with 217 additions and 335 deletions.
2 changes: 1 addition & 1 deletion docs/source/api/python/batch.rst
@@ -1,6 +1,6 @@
.. _apibatch:

BatchedDGLGraph -- Enable batched graph operations
dgl.batched_graph
==================================================

.. currentmodule:: dgl
2 changes: 1 addition & 1 deletion docs/source/api/python/batch_heterograph.rst
@@ -1,6 +1,6 @@
.. _apibatch_heterograph:

BatchedDGLHeteroGraph -- Enable batched graph operations for heterographs
dgl.batched_heterograph
=========================================================================

.. currentmodule:: dgl
4 changes: 2 additions & 2 deletions docs/source/api/python/data.rst
@@ -1,7 +1,7 @@
.. _apidata:

Dataset
=======
dgl.data
=========

.. currentmodule:: dgl.data

113 changes: 110 additions & 3 deletions docs/source/api/python/function.rst
@@ -1,9 +1,116 @@
.. _apifunction:

Builtin functions
=================
.. currentmodule:: dgl.function

.. automodule:: dgl.function
dgl.function
==================================

In DGL, message passing is expressed by two APIs:

- ``send(edges, message_func)`` for computing the messages along the given edges.
- ``recv(nodes, reduce_func)`` for collecting the incoming messages, performing aggregation, and so on.

Although the two-stage abstraction can cover all the models that are defined in the message
passing paradigm, it is inefficient because it requires storing explicit messages. See the DGL
`blog post <https://www.dgl.ai/blog/2019/05/04/kernel.html>`_ for more
details and performance results.

Our solution, also explained in the blog post, is to fuse the two stages into one kernel so no
explicit messages are generated and stored. To achieve this, we recommend using our **built-in
message and reduce functions** so that DGL can analyze and map them to fused dedicated kernels. Here
are some examples (in PyTorch syntax).

.. code:: python

    import dgl
    import dgl.function as fn
    import torch as th

    g = ... # create a DGLGraph
    g.ndata['h'] = th.randn((g.number_of_nodes(), 10)) # each node has feature size 10
    g.edata['w'] = th.randn((g.number_of_edges(), 1)) # each edge has feature size 1
    # collect features from source nodes and aggregate them in destination nodes
    g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h_sum'))
    # multiply source node features with edge weights and aggregate them in destination nodes
    g.update_all(fn.u_mul_e('h', 'w', 'm'), fn.max('m', 'h_max'))
    # compute edge embedding by multiplying source and destination node embeddings
    g.apply_edges(fn.u_mul_v('h', 'h', 'w_new'))

``fn.copy_u``, ``fn.u_mul_e``, and ``fn.u_mul_v`` are built-in message functions, while ``fn.sum``
and ``fn.max`` are built-in reduce functions. We use ``u``, ``v`` and ``e`` to represent
source nodes, destination nodes, and the edges between them, respectively. For example, ``copy_u`` copies the source
node data as the messages, and ``u_mul_e`` multiplies source node features with edge features.
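
As a rough illustration of why fusing helps, here is a minimal pure-Python sketch (not DGL code) of what ``update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h_sum'))`` computes. It is written first in the two-stage style, which stores one message per edge, and then in a fused style that accumulates directly into the destination node. The toy edge list, the scalar features, and both helper names are assumptions made for this example; DGL's real kernels operate on tensors.

```python
# Toy graph: each edge is a (source u, destination v) pair. All names are illustrative.
edges = [(0, 2), (1, 2), (2, 3)]
h = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}  # one scalar feature per node


def two_stage_copy_u_sum(edges, h):
    # Stage 1 (send): materialize one message per edge by copying the source feature.
    messages = [(v, h[u]) for u, v in edges]
    # Stage 2 (recv): reduce the stored messages at each destination node.
    h_sum = {n: 0.0 for n in h}
    for v, m in messages:
        h_sum[v] += m
    return h_sum


def fused_copy_u_sum(edges, h):
    # Fused: accumulate straight into the destination; no message list is stored.
    h_sum = {n: 0.0 for n in h}
    for u, v in edges:
        h_sum[v] += h[u]
    return h_sum


result = fused_copy_u_sum(edges, h)
```

Both helpers return the same dictionary; the fused version simply never builds the intermediate ``messages`` list, which is the saving the text describes. In this sketch, nodes without incoming edges keep the initial value ``0.0``.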

To define a unary message function (e.g. ``copy_u``), specify one input feature name and one output
message name. To define a binary message function (e.g. ``u_mul_e``), specify
two input feature names and one output message name. During the computation,
the message function will read the data under the given names, perform computation, and return
the output using the output name. For example, the above ``fn.u_mul_e('h', 'w', 'm')`` is
the same as the following user-defined function:

.. code:: python

    def udf_u_mul_e(edges):
        return {'m' : edges.src['h'] * edges.data['w']}

To define a reduce function, one input message name and one output node feature name
need to be specified. For example, the above ``fn.max('m', 'h_max')`` is the same as the
following user-defined function:

.. code:: python

    def udf_max(nodes):
        return {'h_max' : th.max(nodes.mailbox['m'], 1)[0]}

Broadcasting is supported for binary message functions, which means the tensor arguments
can be automatically expanded to be of equal sizes. The supported broadcasting semantics
are standard and match `NumPy <https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html>`_
and `PyTorch <https://pytorch.org/docs/stable/notes/broadcasting.html>`_. If you are not familiar
with broadcasting, see the linked topics to learn more. In the
above example, ``fn.u_mul_e`` performs broadcast multiplication automatically because
the node feature ``'h'`` and the edge feature ``'w'`` have different but broadcast-compatible shapes.
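
To make the shape rule concrete, the following pure-Python sketch computes the result shape of a broadcast under the NumPy-style rules linked above. ``broadcast_shape`` is a hypothetical helper written for this example, not a DGL function, and ``num_edges`` is an arbitrary value.

```python
def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, following NumPy-style rules."""
    # Left-pad the shorter shape with 1s so both shapes have the same rank.
    rank = max(len(shape_a), len(shape_b))
    a = (1,) * (rank - len(shape_a)) + tuple(shape_a)
    b = (1,) * (rank - len(shape_b)) + tuple(shape_b)
    out = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            out.append(da)
        elif da == 1:
            out.append(db)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} cannot be broadcast")
    return tuple(out)


# Per-edge view of the example: source feature 'h' has size 10, edge feature 'w' has size 1.
num_edges = 7  # hypothetical edge count
msg_shape = broadcast_shape((num_edges, 10), (num_edges, 1))
```

Here the size-1 dimension of ``'w'`` is stretched to match the size-10 dimension of ``'h'``, so each edge weight scales all ten feature entries of its source node.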

All of DGL's built-in functions support both CPU and GPU execution as well as backward computation, so they
can be used in any ``autograd`` system. Built-in functions can also be used not only in ``update_all``
or ``apply_edges`` as shown in the example, but wherever message and reduce functions are
required (e.g. ``pull``, ``push``, ``send_and_recv``).

Here is a cheatsheet of all the DGL built-in functions.

+-------------------------+-----------------------------------------------------------------+-----------------------+
| Category                | Functions                                                       | Memo                  |
+=========================+=================================================================+=======================+
| Unary message function  | ``copy_u``                                                      |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``copy_e``                                                      |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``copy_src``                                                    | alias of ``copy_u``   |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``copy_edge``                                                   | alias of ``copy_e``   |
+-------------------------+-----------------------------------------------------------------+-----------------------+
| Binary message function | ``u_add_v``, ``u_sub_v``, ``u_mul_v``, ``u_div_v``, ``u_dot_v`` |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``u_add_e``, ``u_sub_e``, ``u_mul_e``, ``u_div_e``, ``u_dot_e`` |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``v_add_u``, ``v_sub_u``, ``v_mul_u``, ``v_div_u``, ``v_dot_u`` |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``v_add_e``, ``v_sub_e``, ``v_mul_e``, ``v_div_e``, ``v_dot_e`` |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``e_add_u``, ``e_sub_u``, ``e_mul_u``, ``e_div_u``, ``e_dot_u`` |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``e_add_v``, ``e_sub_v``, ``e_mul_v``, ``e_div_v``, ``e_dot_v`` |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``src_mul_edge``                                                | alias of ``u_mul_e``  |
+-------------------------+-----------------------------------------------------------------+-----------------------+
| Reduce function         | ``max``                                                         |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``min``                                                         |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``sum``                                                         |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``prod``                                                        |                       |
|                         +-----------------------------------------------------------------+-----------------------+
|                         | ``mean``                                                        |                       |
+-------------------------+-----------------------------------------------------------------+-----------------------+
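
The binary rows of the table follow a regular ``<lhs>_<op>_<rhs>`` pattern. As a reading aid, this short sketch regenerates those names from the operands and operators that appear in the table. It only builds strings and does not import DGL; the variable names are chosen for this example.

```python
from itertools import permutations

operands = ["u", "v", "e"]  # source node, destination node, edge
operators = ["add", "sub", "mul", "div", "dot"]

# Every ordered pair of distinct operands, combined with every operator.
binary_names = [
    f"{lhs}_{op}_{rhs}"
    for lhs, rhs in permutations(operands, 2)
    for op in operators
]
```

The list contains the 30 binary names shown in the table, in the same row order; the aliases (``copy_src``, ``copy_edge``, ``src_mul_edge``) are not generated because they fall outside this pattern.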

Message functions
-----------------
2 changes: 1 addition & 1 deletion docs/source/api/python/graph.rst
@@ -1,6 +1,6 @@
.. _apigraph:

DGLGraph -- Untyped graph with node/edge features
dgl.DGLGraph
=========================================

.. currentmodule:: dgl
2 changes: 1 addition & 1 deletion docs/source/api/python/heterograph.rst
@@ -1,6 +1,6 @@
.. _apiheterograph:

DGLHeteroGraph -- Typed graph with node/edge features
dgl.DGLHeteroGraph
=====================================================

.. currentmodule:: dgl
4 changes: 2 additions & 2 deletions docs/source/api/python/index.rst
@@ -6,17 +6,17 @@ API Reference

graph
heterograph
init
batch
batch_heterograph
nn
init
function
traversal
propagate
udf
sampler
data
transform
nn
subgraph
graph_store
nodeflow
4 changes: 2 additions & 2 deletions docs/source/api/python/model_zoo.rst
@@ -1,7 +1,7 @@
.. _apimodelzoo:

Model Zoo
=========
dgl.model_zoo
==============

.. currentmodule:: dgl.model_zoo

2 changes: 1 addition & 1 deletion docs/source/api/python/nn.rst
@@ -1,6 +1,6 @@
.. _apinn:

NN Modules
dgl.nn
==========

.. automodule:: dgl.nn
4 changes: 2 additions & 2 deletions docs/source/api/python/nodeflow.rst
@@ -1,7 +1,7 @@
.. _apinodeflow:

NodeFlow -- Graph sampled from a large graph
============================================
dgl.nodeflow
==============

.. currentmodule:: dgl
.. autoclass:: NodeFlow
9 changes: 7 additions & 2 deletions docs/source/api/python/propagate.rst
@@ -1,8 +1,13 @@
Message Propagation
===================
dgl.propagate
===============

.. automodule:: dgl.propagate

Propagate messages and perform computation following graph traversal order. ``prop_nodes_XXX``
calls traversal algorithm ``XXX`` and triggers :func:`~DGLGraph.pull()` on the visited node
set at each iteration. ``prop_edges_YYY`` applies traversal algorithm ``YYY`` and triggers
:func:`~DGLGraph.send_and_recv()` on the visited edge set at each iteration.
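
The pattern described above can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than DGL's implementation: ``prop_nodes`` and ``pull_sum`` are hypothetical helpers, the graph is a toy predecessor map, and the frontiers are a hand-written list standing in for a traversal generator.

```python
def prop_nodes(frontiers, pull):
    """Call ``pull`` on each visited node set, in traversal order."""
    for nodes in frontiers:
        pull(nodes)


# Toy graph: predecessors of each node, and one scalar feature per node.
preds = {0: [], 1: [0], 2: [0], 3: [1, 2]}
h = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}


def pull_sum(nodes):
    # Each visited node replaces its feature with the sum of its predecessors' features.
    for v in nodes:
        h[v] = sum(h[u] for u in preds[v])


# Hand-written frontiers: nodes one hop from node 0, then nodes two hops away.
prop_nodes([[1, 2], [3]], pull_sum)
```

Because node 3 is pulled only after nodes 1 and 2 have been updated, it sees their new values. That ordering is the point of propagating along a traversal.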

.. autosummary::
:toctree: ../../generated/

4 changes: 3 additions & 1 deletion docs/source/api/python/random.rst
@@ -1,10 +1,12 @@
.. _apirandom:

DGL Random Number Generator Controls
dgl.random
====================================

.. automodule:: dgl.random

Utilities used to control DGL's random number generator.

.. autosummary::
:toctree: ../../generated

8 changes: 6 additions & 2 deletions docs/source/api/python/sampler.rst
@@ -1,7 +1,11 @@
.. _apisampler:
Graph samplers
==============
dgl.contrib.sampling
======================

Module for sampling algorithms on graphs. Each algorithm is implemented as a
data loader, which produces a sampled subgraph (called a NodeFlow) at each
iteration.

.. autofunction:: dgl.contrib.sampling.sampler.NeighborSampler
.. autofunction:: dgl.contrib.sampling.sampler.LayerSampler
2 changes: 1 addition & 1 deletion docs/source/api/python/subgraph.rst
@@ -1,6 +1,6 @@
.. _apisubgraph:

DGLSubGraph -- Class for subgraph data structure
dgl.subgraph
================================================

.. currentmodule:: dgl.subgraph
4 changes: 3 additions & 1 deletion docs/source/api/python/transform.rst
@@ -1,10 +1,12 @@
.. _api-transform:

Transform -- Graph Transformation
dgl.transform
=================================

.. automodule:: dgl.transform

Common algorithms on graphs.

.. autosummary::
:toctree: ../../generated/

12 changes: 11 additions & 1 deletion docs/source/api/python/traversal.rst
@@ -1,8 +1,18 @@
Graph Traversal
dgl.traversal
===============

.. automodule:: dgl.traversal

Graph traversal algorithms are implemented as Python generators, which return the visited set
of nodes or edges at each iteration. The naming convention
is ``<algorithm>_[nodes|edges]_generator``. An example usage is as follows.

.. code:: python

    import dgl

    g = ... # some DGLGraph
    for nodes in dgl.bfs_nodes_generator(g, 0):
        do_something(nodes)

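
For readers unfamiliar with the generator style, here is a minimal pure-Python sketch of a breadth-first traversal that yields one frontier per iteration. ``bfs_frontiers`` and the successor map are hypothetical stand-ins for illustration; DGL's own generators operate on a ``DGLGraph`` and yield tensors.

```python
def bfs_frontiers(succs, source):
    """Yield the list of newly visited nodes at each BFS level."""
    visited = {source}
    frontier = [source]
    while frontier:
        yield frontier
        next_frontier = []
        for u in frontier:
            for v in succs.get(u, []):
                if v not in visited:
                    visited.add(v)
                    next_frontier.append(v)
        frontier = next_frontier


succs = {0: [1, 2], 1: [3], 2: [3]}  # toy successor lists
levels = list(bfs_frontiers(succs, 0))
```

Each yielded list plays the role of ``nodes`` in the loop above: the caller processes one level at a time without the traversal having to build the full visiting order up front.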
.. autosummary::
:toctree: ../../generated/

8 changes: 4 additions & 4 deletions docs/source/api/python/udf.rst
@@ -1,12 +1,12 @@
.. _apiudf:

User-defined function related data structures
dgl.udf
==================================================

.. currentmodule:: dgl.udf
.. automodule:: dgl.udf

There are two types of user-defined functions in DGL:
User-defined functions (UDFs) are flexible ways to configure message passing computation.
There are two types of UDFs in DGL:

* **Node UDF** of signature ``NodeBatch -> dict``. The argument represents
a batch of nodes. The returned dictionary should have ``str`` type key and ``tensor``
@@ -15,7 +15,7 @@ There are two types of user-defined functions in DGL:
a batch of edges. The returned dictionary should have ``str`` type key and ``tensor``
type values.

Note: the size of the batch dimension is determined by the DGL framework
The size of the batch dimension is determined by the DGL framework
for good efficiency and small memory footprint. Users should not make
assumptions about the batch dimension.
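
One way to respect this rule is to write UDFs that work element-wise over whatever batch they receive. The sketch below uses a hypothetical ``ToyEdgeBatch`` named tuple with plain lists in place of DGL's edge batch and tensors, purely to illustrate the idea.

```python
from collections import namedtuple

# Hypothetical stand-in for an edge batch: each field maps a feature name to
# a list with one entry per edge in the batch.
ToyEdgeBatch = namedtuple("ToyEdgeBatch", ["src", "dst", "data"])


def edge_udf(edges):
    # Works per element, so it is correct for any batch size the framework picks.
    return {"m": [h * w for h, w in zip(edges.src["h"], edges.data["w"])]}


small = ToyEdgeBatch(src={"h": [1.0, 2.0]}, dst={}, data={"w": [0.5, 0.5]})
large = ToyEdgeBatch(src={"h": [1.0, 2.0, 3.0]}, dst={}, data={"w": [2.0, 2.0, 2.0]})
out_small = edge_udf(small)
out_large = edge_udf(large)
```

The same function handles a batch of two edges and a batch of three, because nothing in it hard-codes the batch length. It also returns a dictionary with ``str`` keys, matching the UDF signature described above.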

5 changes: 2 additions & 3 deletions docs/source/conf.py
@@ -195,9 +195,8 @@
from sphinx_gallery.sorting import FileNameSortKey

examples_dirs = ['../../tutorials/basics',
'../../tutorials/models',
'../../tutorials/hetero'] # path to find sources
gallery_dirs = ['tutorials/basics','tutorials/models','tutorials/hetero'] # path to generate docs
'../../tutorials/models'] # path to find sources
gallery_dirs = ['tutorials/basics', 'tutorials/models'] # path to generate docs
reference_url = {
'dgl' : None,
'numpy': 'http://docs.scipy.org/doc/numpy/',