docs: revamp narrative

Signed-off-by: Han Xiao <[email protected]>
osmanbaskaya · Dec 7, 2022 · 77267b8 · 77267b8
1 parent b057b62
commit 77267b8
Show file tree

Hide file tree

Showing 173 changed files with 4,339 additions and 5,239 deletions.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -255,17 +255,13 @@ Good docs make developers happy, and we love happy developers! We've got a few d
 10. Link to any existing explanations of the concepts you are using.
 
 Bonus: **Know when to break the rules**. Documentation writing is as much art as it is science. Sometimes you will have to deviate from these rules in order to write good documentation.
-Refer to these pages as standardized examples:
 
-* https://docs.jina.ai/fundamentals/flow/access-flow-api/ 
-* https://docs.jina.ai/fundamentals/flow/flow-api/
-* https://docs.jina.ai/how-to/flow-switch/ 
 
 [MyST](https://myst-parser.readthedocs.io/en/latest/) Elements Usage
 
-1. Use the `{tab}` element to show multiple ways of doing one thing. [Example](https://docs.jina.ai/fundamentals/flow/create-flow/#instantiate-a-flow) 
-2. Use the `{admonition}` boxes with care. We recommend restricting yourself to [Hint](https://docs.jina.ai/fundamentals/flow/create-flow/#create-a-flow), [Caution](https://docs.jina.ai/fundamentals/flow/flow-api/#add-graphql-endpoint) and [See Also](https://docs.jina.ai/fundamentals/flow/flow-api/#add-graphql-endpoint).
-3. Use `{dropdown}` to hide optional content, such as long code snippets or console output. [Example](https://docs.jina.ai/fundamentals/flow/access-flow-api/#use-http-client-to-send-request)
+1. Use the `{tab}` element to show multiple ways of doing one thing. [Example](https://docs.jina.ai/concepts/flow/create-flow/#instantiate-a-flow) 
+2. Use the `{admonition}` boxes with care. We recommend restricting yourself to [Hint](https://docs.jina.ai/concepts/flow/create-flow/#create-a-flow), [Caution](https://docs.jina.ai/concepts/flow/flow-api/#add-graphql-endpoint) and [See Also](https://docs.jina.ai/concepts/flow/flow-api/#add-graphql-endpoint).
+3. Use `{dropdown}` to hide optional content, such as long code snippets or console output. [Example](https://docs.jina.ai/concepts/flow/access-flow-api/#use-http-client-to-send-request)
 
 ### Building documentation on your local machine
 

diff --git a/README.md b/README.md
@@ -43,12 +43,13 @@ Applications built with Jina enjoy the following features out of the box:
   - Improved engineering efficiency thanks to the Jina AI ecosystem, so you can focus on innovating with the data applications you build.
   - Free CPU/GPU hosting via [Jina AI Cloud](https://cloud.jina.ai).
 
+<!-- end jina-description -->
 
 <p align="center">
 <a href="#"><img src="https://github.com/jina-ai/jina/blob/master/.github/readme/core-tree-graph.svg?raw=true" alt="Jina in Jina AI neural search ecosystem" width="100%"></a>
 </p>
 
-<!-- end jina-description -->
+
 
 
 ## [Documentation](https://docs.jina.ai)
@@ -70,10 +71,10 @@ Find more install options on [Apple Silicon/Windows](https://docs.jina.ai/get-st
 Document, Executor and Flow are three fundamental concepts in Jina.
 
 - [**Document**](https://docarray.jina.ai/) is the fundamental data structure.
-- [**Executor**](https://docs.jina.ai/fundamentals/executor/) is a Python class with functions that use Documents as IO.
-- [**Flow**](https://docs.jina.ai/fundamentals/flow/) ties Executors together into a pipeline and exposes it with an API gateway.
+- [**Executor**](https://docs.jina.ai/concepts/executor/) is a Python class with functions that use Documents as IO.
+- [**Flow**](https://docs.jina.ai/concepts/flow/) ties Executors together into a pipeline and exposes it with an API gateway.
 
-[The full glossary is explained here.](https://docs.jina.ai/fundamentals/architecture-overview/)
+[The full glossary is explained here.](https://docs.jina.ai/concepts/architecture-overview/)
 
 
 ---

diff --git a/docs/_static/main.css b/docs/_static/main.css
@@ -11,6 +11,8 @@ html.loaded-in-iframe .page .main {
 
 .sidebar-logo {
     max-width: 50%;
+    margin-top: 1em;
+    margin-bottom: 1em;
 }
 
 

diff --git a/docs/concepts/client/callbacks.md b/docs/concepts/client/callbacks.md
@@ -0,0 +1,226 @@
+(callback-functions)=
+# Callbacks
+
+After performing {meth}`~jina.clients.mixin.PostMixin.post`, you may want to further process the obtained results.
+
+For this purpose, Jina implements a promise-like interface, letting you specify three kinds of callback functions:
+
+- `on_done` is executed while streaming, after successful completion of each request
+- `on_error` is executed while streaming, whenever an error occurs in each request
+- `on_always` is always performed while streaming, no matter the success or failure of each request
+
+
+Note that these callbacks only work for requests (and failures) *inside the stream*, for example inside an Executor.
+If the failure is due to an error happening outside of 
+streaming, then these callbacks will not be triggered.
+For example, a `SIGKILL` from the client OS during the handling of the request, or a networking issue,
+will not trigger the callback.
+
+
+Callback functions in Jina expect a `Response` of the type {class}`~jina.types.request.data.DataRequest`, which contains resulting Documents,
+parameters, and other information.
+
+## Handle DataRequest in callbacks
+
+`DataRequest`s are objects that are sent by Jina internally. Callback functions process DataRequests, and `client.post()`
+can return DataRequests.
+
+`DataRequest` objects can be seen as a container for data relevant for a given request, it contains the following fields:
+
+````{tab} header
+
+The request header.
+
+```python
+from pprint import pprint
+
+from jina import Client
+
+Client().post(on='/', on_done=lambda x: pprint(x.header))
+```
+
+```console
+request_id: "ea504823e9de415d890a85d1d00ccbe9"
+exec_endpoint: "/"
+target_executor: ""
+```
+
+````
+
+````{tab} parameters
+
+The input parameters of the associated request. In particular, `DataRequest.parameters['__results__']` is a 
+reserved field that gets populated by Executors returning a Python `dict`. 
+Information in those returned `dict`s gets collected here, behind each Executor ID.
+
+```python
+from pprint import pprint
+
+from jina import Client
+
+Client().post(on='/', on_done=lambda x: pprint(x.parameters))
+```
+
+```console
+{'__results__': {}}
+```
+
+````
+
+````{tab} routes
+
+The routing information of the data request. It contains the which Executors have been called, and the order in which they were called.
+The timing and latency of each Executor is also recorded.
+
+```python
+from pprint import pprint
+
+from jina import Client
+
+Client().post(on='/', on_done=lambda x: pprint(x.routes))
+```
+
+```console
+[executor: "gateway"
+start_time {
+  seconds: 1662637747
+  nanos: 790248000
+}
+end_time {
+  seconds: 1662637747
+  nanos: 794104000
+}
+, executor: "executor0"
+start_time {
+  seconds: 1662637747
+  nanos: 790466000
+}
+end_time {
+  seconds: 1662637747
+  nanos: 793982000
+}
+]
+
+```
+
+````
+
+````{tab} docs
+The DocumentArray being passed between and returned by the Executors. These are the Documents usually processed in a callback function, and are often the main payload.
+
+```python
+from pprint import pprint
+
+from jina import Client
+
+Client().post(on='/', on_done=lambda x: pprint(x.docs))
+```
+
+```console
+<DocumentArray (length=0) at 5044245248>
+
+```
+````
+
+
+Accordingly, a callback that processing documents can be defined as:
+
+```{code-block} python
+---
+emphasize-lines: 4
+---
+from jina.types.request.data import DataRequest
+
+def my_callback(resp: DataRequest):
+    foo(resp.docs)
+```
+
+## Handle exceptions in callbacks
+
+Server error can be caught by Client's `on_error` callback function. You can get the error message and traceback from `header.status`:
+
+```python
+from pprint import pprint
+
+from jina import Flow, Client, Executor, requests
+
+
+class MyExec1(Executor):
+    @requests
+    def foo(self, **kwargs):
+        raise NotImplementedError
+
+
+with Flow(port=12345).add(uses=MyExec1) as f:
+    c = Client(port=f.port)
+    c.post(on='/', on_error=lambda x: pprint(x.header.status))
+```
+
+
+```text
+code: ERROR
+description: "NotImplementedError()"
+exception {
+  name: "NotImplementedError"
+  stacks: "Traceback (most recent call last):\n"
+  stacks: "  File \"/Users/hanxiao/Documents/jina/jina/serve/runtimes/worker/__init__.py\", line 181, in process_data\n    result = await self._data_request_handler.handle(requests=requests)\n"
+  stacks: "  File \"/Users/hanxiao/Documents/jina/jina/serve/runtimes/request_handlers/data_request_handler.py\", line 152, in handle\n    return_data = await self._executor.__acall__(\n"
+  stacks: "  File \"/Users/hanxiao/Documents/jina/jina/serve/executors/__init__.py\", line 301, in __acall__\n    return await self.__acall_endpoint__(__default_endpoint__, **kwargs)\n"
+  stacks: "  File \"/Users/hanxiao/Documents/jina/jina/serve/executors/__init__.py\", line 322, in __acall_endpoint__\n    return func(self, **kwargs)\n"
+  stacks: "  File \"/Users/hanxiao/Documents/jina/jina/serve/executors/decorators.py\", line 213, in arg_wrapper\n    return fn(executor_instance, *args, **kwargs)\n"
+  stacks: "  File \"/Users/hanxiao/Documents/jina/toy44.py\", line 10, in foo\n    raise NotImplementedError\n"
+  stacks: "NotImplementedError\n"
+  executor: "MyExec1"
+}
+```
+
+
+
+In the example below, our Flow passes the message then prints the result when successful.
+If something goes wrong, it beeps. Finally, the result is written to output.txt.
+
+```python
+from jina import Flow, Client, Document
+
+
+def beep(*args):
+    # make a beep sound
+    import sys
+
+    sys.stdout.write('\a')
+
+
+with Flow().add() as f, open('output.txt', 'w') as fp:
+    client = Client(port=f.port)
+    client.post(
+        '/',
+        Document(),
+        on_done=print,
+        on_error=beep,
+        on_always=lambda x: x.docs.save(fp),
+    )
+```
+
+````{admonition} What errors can be handled by the callback?
+:class: caution
+Callbacks can handle errors that are caused by Executors raising an Exception.
+
+A callback will not receive exceptions:
+- from the Gateway having connectivity errors with the Executors.
+- between the Client and the Gateway.
+````
+
+## Continue streaming when an error occurs
+
+`client.post()` accepts a `continue_on_error` parameter. When set to `True`, the Client will keep trying to send the remaining requests. The `continue_on_error` parameter will only apply
+to Exceptions caused by an Executor, but in case of network connectivity issues, an Exception will be raised.
+
+## Transient fault handling with retries
+
+`client.post()` accepts `max_attempts`, `initial_backoff`, `max_backoff` and `backoff_multiplier` parameters to control the capacity to retry requests, when a transient connectivity error occurs, using an exponential backoff strategy.
+This can help to overcome transient network connectivity issues. 
+
+The `max_attempts` parameter determines the number of sending attempts, including the original request.
+The `initial_backoff`, `max_backoff`, and `backoff_multiplier` parameters determine the randomized delay in seconds before retry attempts.
+
+The initial retry attempt will occur at random(0, initial_backoff). In general, the n-th attempt will occur at random(0, min(initial_backoff*backoff_multiplier**(n-1), max_backoff)).