Commit 5d83890

Add the start of a new user guide.

bdarnell committed Jun 25, 2014
1 parent 681882b commit 5d83890
Showing 5 changed files with 290 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/documentation.rst
@@ -4,6 +4,7 @@ Tornado Documentation
.. toctree::
   :titlesonly:

   guide
   overview
   webframework
   networking
8 changes: 8 additions & 0 deletions docs/guide.rst
@@ -0,0 +1,8 @@
User's guide
============

.. toctree::

   guide/intro
   guide/async
   guide/coroutines
111 changes: 111 additions & 0 deletions docs/guide/async.rst
@@ -0,0 +1,111 @@
Asynchronous and non-Blocking
-----------------------------

Real-time web features require a long-lived mostly-idle connection per
user. In a traditional synchronous web server, this implies devoting
one thread to each user, which can be very expensive.

To minimize the cost of concurrent connections, Tornado uses a
single-threaded event loop. This means that all application code
should aim to be asynchronous and non-blocking because only one
operation can be active at a time.
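
To see why this matters, consider a minimal sketch (the handler names
and the five-second delay here are invented for illustration): while
one request is stuck in ``time.sleep``, the single thread cannot serve
any other request::

    import time

    import tornado.ioloop
    import tornado.web

    class SlowHandler(tornado.web.RequestHandler):
        def get(self):
            # Blocks the only thread; every other connection must wait
            # until this handler returns.
            time.sleep(5)
            self.write("done")

    class FastHandler(tornado.web.RequestHandler):
        def get(self):
            # Normally instant, but stalled while SlowHandler sleeps.
            self.write("hello")

    application = tornado.web.Application([
        (r"/slow", SlowHandler),
        (r"/fast", FastHandler),
    ])

    if __name__ == "__main__":
        application.listen(8888)
        tornado.ioloop.IOLoop.instance().start()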

The terms asynchronous and non-blocking are closely related and are
often used interchangeably, but they are not quite the same thing.

Blocking
~~~~~~~~

A function **blocks** when it waits for something to happen before
returning. A function may block for many reasons: network I/O, disk
I/O, mutexes, etc. In fact, *every* function blocks, at least a
little bit, while it is running and using the CPU (for an extreme
example that demonstrates why CPU blocking must be taken as seriously
as other kinds of blocking, consider password hashing functions like
`bcrypt <http://bcrypt.sourceforge.net/>`_, which by design use
hundreds of milliseconds of CPU time, far more than a typical network
or disk access).

A function can be blocking in some respects and non-blocking in
others. For example, `tornado.httpclient` in the default
configuration blocks on DNS resolution but not on other network access
(to mitigate this, use `.ThreadedResolver` or a
``tornado.curl_httpclient`` with a properly-configured build of
``libcurl``). In the context of Tornado we generally talk about
blocking with respect to network I/O, although all kinds of blocking
are to be minimized.
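
For example, the thread-pool resolver can be selected once at startup;
a minimal sketch (either configuration alone is enough, and the curl
variant assumes ``pycurl`` and a suitable ``libcurl`` are installed)::

    from tornado.httpclient import AsyncHTTPClient
    from tornado.netutil import Resolver

    # Perform DNS lookups in a thread pool so fetches do not block
    # on name resolution.
    Resolver.configure('tornado.netutil.ThreadedResolver')

    # Alternatively, use the libcurl-based client, which handles DNS
    # itself when libcurl is built with asynchronous DNS support.
    # AsyncHTTPClient.configure(
    #     "tornado.curl_httpclient.CurlAsyncHTTPClient")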

Asynchronous
~~~~~~~~~~~~

An **asynchronous** function returns before it is finished, and
generally causes some work to happen in the background before
triggering some future action in the application (as opposed to normal
**synchronous** functions, which do everything they are going to do
before returning). There are many styles of asynchronous interfaces:

* Callback argument
* Return a placeholder (`.Future`, ``Promise``, ``Deferred``)
* Deliver to a queue
* Callback registry (e.g. POSIX signals)

Regardless of which type of interface is used, asynchronous functions
*by definition* interact differently with their callers; there is no
free way to make a synchronous function asynchronous in a way that is
transparent to its callers (systems like `gevent
<http://www.gevent.org>`_ use lightweight threads to offer performance
comparable to asynchronous systems, but they do not actually make
things asynchronous).

Examples
~~~~~~~~

Here is a sample synchronous function::

    from tornado.httpclient import HTTPClient

    def synchronous_fetch(url):
        http_client = HTTPClient()
        response = http_client.fetch(url)
        return response.body

And here is the same function rewritten to be asynchronous with a
callback argument::

    from tornado.httpclient import AsyncHTTPClient

    def asynchronous_fetch(url, callback):
        http_client = AsyncHTTPClient()
        def handle_response(response):
            callback(response.body)
        http_client.fetch(url, callback=handle_response)

And again with a `.Future` instead of a callback::

    from tornado.concurrent import Future

    def async_fetch_future(url):
        http_client = AsyncHTTPClient()
        my_future = Future()
        fetch_future = http_client.fetch(url)
        fetch_future.add_done_callback(
            lambda f: my_future.set_result(f.result()))
        return my_future

The raw `.Future` version is more complex, but ``Futures`` are
nonetheless recommended practice in Tornado because they have two
major advantages. Error handling is more consistent since the
`.Future.result` method can simply raise an exception (as opposed to
the ad-hoc error handling common in callback-oriented interfaces), and
``Futures`` lend themselves well to use with coroutines. Coroutines
will be discussed in depth in the next section of this guide. Here is
the coroutine version of our sample function, which is very similar to
the original synchronous version::

    from tornado import gen

    @gen.coroutine
    def fetch_coroutine(url):
        http_client = AsyncHTTPClient()
        response = yield http_client.fetch(url)
        return response.body
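
Any of these asynchronous versions can be driven from a simple script
with `.IOLoop.run_sync`; a sketch (the URL here is arbitrary)::

    from tornado.ioloop import IOLoop

    def main():
        body = IOLoop.instance().run_sync(
            lambda: fetch_coroutine("http://www.tornadoweb.org"))
        print(len(body))

    if __name__ == "__main__":
        main()
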
142 changes: 142 additions & 0 deletions docs/guide/coroutines.rst
@@ -0,0 +1,142 @@
Coroutines
==========

**Coroutines** are the recommended way to write asynchronous code in
Tornado. Coroutines use the Python ``yield`` keyword to suspend and
resume execution instead of a chain of callbacks (cooperative
lightweight threads as seen in frameworks like `gevent
<http://www.gevent.org>`_ are sometimes called coroutines as well, but
in Tornado all coroutines use explicit context switches and are called
as asynchronous functions).

Coroutines are almost as simple as synchronous code, but without the
expense of a thread. They also `make concurrency easier
<https://glyph.twistedmatrix.com/2014/02/unyielding.html>`_ to reason
about by reducing the number of places where a context switch can
happen.

Example::

    from tornado import gen

    @gen.coroutine
    def fetch_coroutine(url):
        http_client = AsyncHTTPClient()
        response = yield http_client.fetch(url)
        # In Python versions prior to 3.3, returning a value from
        # a generator is not allowed and you must use
        #   raise gen.Return(response.body)
        # instead.
        return response.body

How it works
~~~~~~~~~~~~

A function containing ``yield`` is a **generator**. All generators
are asynchronous; when called they return a generator object instead
of running to completion. The ``@gen.coroutine`` decorator
communicates with the generator via the ``yield`` expressions, and
with the coroutine's caller by returning a `.Future`.

Here is a simplified version of the coroutine decorator's inner loop::

    # Simplified inner loop of tornado.gen.Runner
    def run(self):
        # send(x) makes the current yield return x.
        # It returns when the next yield is reached.
        future = self.gen.send(self.next)
        def callback(f):
            self.next = f.result()
            self.run()
        future.add_done_callback(callback)

The decorator receives a `.Future` from the generator, waits (without
blocking) for that `.Future` to complete, then "unwraps" the `.Future`
and sends the result back into the generator as the result of the
``yield`` expression. Most asynchronous code never touches the `.Future`
class directly except to immediately pass the `.Future` returned by
an asynchronous function to a ``yield`` expression.
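
In practice this means coroutines compose naturally: one coroutine can
wait on another simply by yielding the `.Future` it returns. A sketch
(``fetch_length`` is an invented name; ``fetch_coroutine`` is the
example from earlier in this guide)::

    from tornado import gen

    @gen.coroutine
    def fetch_length(url):
        # fetch_coroutine returns a Future; yielding it suspends
        # fetch_length until the fetched body is available.
        body = yield fetch_coroutine(url)
        # On Python 3.3+ a plain "return len(body)" also works.
        raise gen.Return(len(body))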

Coroutine patterns
~~~~~~~~~~~~~~~~~~

Interaction with callbacks
^^^^^^^^^^^^^^^^^^^^^^^^^^

To interact with asynchronous code that uses callbacks instead of
`.Future`, wrap the call in a `.Task`. This will add the callback
argument for you and return a `.Future` which you can yield::

    @gen.coroutine
    def call_task():
        # Note that there are no parens on some_function.
        # This will be translated by Task into
        #   some_function(other_args, callback=callback)
        yield gen.Task(some_function, other_args)
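
For example, `.IOLoop.add_timeout` takes a ``callback`` argument, so
wrapping it in a `.Task` is one way to pause a coroutine; a sketch::

    from tornado.ioloop import IOLoop

    @gen.coroutine
    def pause(seconds):
        io_loop = IOLoop.instance()
        # Task supplies the callback argument that add_timeout expects.
        yield gen.Task(io_loop.add_timeout, io_loop.time() + seconds)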

Calling blocking functions
^^^^^^^^^^^^^^^^^^^^^^^^^^

The simplest way to call a blocking function from a coroutine is to
use a `~concurrent.futures.ThreadPoolExecutor`, which returns
``Futures`` that are compatible with coroutines::

    from concurrent.futures import ThreadPoolExecutor

    thread_pool = ThreadPoolExecutor(4)

    @gen.coroutine
    def call_blocking():
        yield thread_pool.submit(blocking_func, args)

Parallelism
^^^^^^^^^^^

The coroutine decorator recognizes lists and dicts whose values are
``Futures``, and waits for all of those ``Futures`` in parallel::

    @gen.coroutine
    def parallel_fetch(url1, url2):
        resp1, resp2 = yield [http_client.fetch(url1),
                              http_client.fetch(url2)]

    @gen.coroutine
    def parallel_fetch_many(urls):
        responses = yield [http_client.fetch(url) for url in urls]
        # responses is a list of HTTPResponses in the same order

    @gen.coroutine
    def parallel_fetch_dict(urls):
        responses = yield {url: http_client.fetch(url)
                           for url in urls}
        # responses is a dict {url: HTTPResponse}

Interleaving
^^^^^^^^^^^^

Sometimes it is useful to save a `.Future` instead of yielding it
immediately, so you can start another operation before waiting::

    @gen.coroutine
    def get(self):
        fetch_future = self.fetch_next_chunk()
        while True:
            chunk = yield fetch_future
            if chunk is None: break
            self.write(chunk)
            fetch_future = self.fetch_next_chunk()
            yield self.flush()

Looping
^^^^^^^

Looping is tricky with coroutines since there is no way in Python
to ``yield`` on every iteration of a ``for`` or ``while`` loop and
capture the result of the yield. Instead, you'll need to separate
the loop condition from accessing the results, as in this example
from `motor <http://motor.readthedocs.org/en/stable/>`_::

    import motor

    @gen.coroutine
    def loop_example(collection):
        cursor = collection.find()
        while (yield cursor.fetch_next):
            doc = cursor.next_object()
28 changes: 28 additions & 0 deletions docs/guide/intro.rst
@@ -0,0 +1,28 @@
Introduction
------------

`Tornado <http://www.tornadoweb.org>`_ is a Python web framework and
asynchronous networking library, originally developed at `FriendFeed
<http://friendfeed.com>`_. By using non-blocking network I/O, Tornado
can scale to tens of thousands of open connections, making it ideal for
`long polling <http://en.wikipedia.org/wiki/Push_technology#Long_polling>`_,
`WebSockets <http://en.wikipedia.org/wiki/WebSocket>`_, and other
applications that require a long-lived connection to each user.

Tornado can be roughly divided into three major components:

* A web framework (including `.RequestHandler` which is subclassed to
  create web applications, and various supporting classes).
* Client- and server-side implementations of HTTP (`.HTTPServer` and
  `.AsyncHTTPClient`).
* An asynchronous networking library (`.IOLoop` and `.IOStream`),
  which serve as the building blocks for the HTTP components and can
  also be used to implement other protocols, as in the sketch below.
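
A sketch of the networking layer used on its own (the host name and
request below are arbitrary): connect an `.IOStream` to a web server,
write a request, and read the response without ever blocking the
`.IOLoop`::

    import socket

    from tornado.ioloop import IOLoop
    from tornado.iostream import IOStream

    def send_request():
        stream.write(b"GET / HTTP/1.0\r\nHost: www.tornadoweb.org\r\n\r\n")
        stream.read_until_close(on_response)

    def on_response(data):
        # Called with the full response once the server closes the
        # connection.
        print(len(data))
        IOLoop.instance().stop()

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    stream = IOStream(s)
    stream.connect(("www.tornadoweb.org", 80), send_request)
    IOLoop.instance().start()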

The Tornado web framework and HTTP server together offer a full-stack
alternative to `WSGI <http://www.python.org/dev/peps/pep-3333/>`_.
While it is possible to use the Tornado web framework in a WSGI
container (`.WSGIAdapter`), or use the Tornado HTTP server as a
container for other WSGI frameworks (`.WSGIContainer`), each of these
combinations has limitations, and to take full advantage of Tornado you
will need to use Tornado's web framework and HTTP server together.
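
For instance, running a small WSGI application on Tornado's HTTP
server takes only a few lines; a sketch (``simple_app`` here is a
stand-in for any WSGI callable)::

    import tornado.httpserver
    import tornado.ioloop
    import tornado.wsgi

    def simple_app(environ, start_response):
        # A minimal WSGI application, used only for illustration.
        status = "200 OK"
        response_headers = [("Content-Type", "text/plain")]
        start_response(status, response_headers)
        return [b"Hello world!\n"]

    container = tornado.wsgi.WSGIContainer(simple_app)
    http_server = tornado.httpserver.HTTPServer(container)
    http_server.listen(8888)
    tornado.ioloop.IOLoop.instance().start()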
