Skip to content

Commit

Permalink
[Serve] [Docs] Document Ray object ref serialization (ray-project#42748)
Browse files Browse the repository at this point in the history
This change documents how to serialize and deserialize object refs in Ray.

Link to updated docs: https://anyscale-ray--42748.com.readthedocs.build/en/42748/ray-core/objects/serialization.html#object-refs

---------

Signed-off-by: Shreyas Krishnaswamy <[email protected]>
Signed-off-by: shrekris-anyscale <[email protected]>
Co-authored-by: Stephanie Wang <[email protected]>
shrekris-anyscale and stephanie-wang authored Jan 30, 2024
1 parent 423cd67 commit f256fbf
Showing 2 changed files with 42 additions and 0 deletions.
25 changes: 25 additions & 0 deletions doc/source/ray-core/doc_code/object_ref_serialization.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
import ray
from ray import cloudpickle

FILE = "external_store.pickle"

ray.init()

my_dict = {"hello": "world"}

obj_ref = ray.put(my_dict)
with open(FILE, "wb+") as f:
cloudpickle.dump(obj_ref, f)

# ObjectRef remains pinned in memory because
# it was serialized with ray.cloudpickle.
del obj_ref

with open(FILE, "rb") as f:
new_obj_ref = cloudpickle.load(f)

# The deserialized ObjectRef works as expected.
assert ray.get(new_obj_ref) == my_dict

# Explicitly free the object.
ray._private.internal_api.free(new_obj_ref)
17 changes: 17 additions & 0 deletions doc/source/ray-core/objects/serialization.rst
Original file line number Diff line number Diff line change
@@ -23,6 +23,23 @@ Plasma is used to efficiently transfer objects across different processes and di

Each node has its own object store. When data is put into the object store, it does not get automatically broadcasted to other nodes. Data remains local to the writer until requested by another task or actor on another node.

Serializing ObjectRefs
~~~~~~~~~~~~~~~~~~~~~~

Explicitly serializing `ObjectRefs` using `ray.cloudpickle` should be used as a last resort. Passing `ObjectRefs` through Ray task arguments and return values is the recommended approach.

Ray `ObjectRefs` can be serialized using `ray.cloudpickle`. The `ObjectRef` can then be deserialized and accessed with `ray.get()`. Note that `ray.cloudpickle` must be used; other pickle tools are not guaranteed to work. Additionally, the process that deserializes the `ObjectRef` must be part of the same Ray cluster that serialized it.

When serialized, the `ObjectRef`'s value will remain pinned in Ray's shared memory object store. The object must be explicitly freed by calling `ray._private.internal_api.free(obj_ref)`.

.. warning::

`ray._private.internal_api.free(obj_ref)` is a private API and may be changed in future Ray versions.

This code example demonstrates how to serialize an `ObjectRef`, store it in external storage, deserialize and use it, and lastly free its object.

.. literalinclude:: /ray-core/doc_code/object_ref_serialization.py

Numpy Arrays
~~~~~~~~~~~~

0 comments on commit f256fbf

Please sign in to comment.