forked from influxdata/kapacitor
[WIP] Initial blob store design doc (influxdata#905)
* initial blob store design doc
* Update BLOB_STORE_DESIGN.md
* update docs to use tagging verbage
Nathaniel Cook authored Sep 21, 2016
1 parent 333fd6b, commit 93f2b2c
Showing 1 changed file with 47 additions and 0 deletions.
@@ -0,0 +1,47 @@
# Blob Store

The blob store is a mechanism to store arbitrary data in Kapacitor.
The data stored is immutable and opaque to Kapacitor.

Data is stored as blobs where each blob has a unique ID.
A tagging system is used to refer to blobs within the store.
A blob may be tagged with a given name.
A blob may be retrieved by its ID or by a tag name.
When retrieving a blob via a tag name, the blob most recently associated with that tag is returned.
Tags may be updated, meaning they can be modified to point at a different blob.
The history of tag-to-blob associations is preserved.

There are no specific limits on the size of a blob, and blobs can be streamed in and out of the store.
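The tagging rules described in this section (a tag resolves to its most recently associated blob, tags can be repointed, and earlier associations are kept) can be sketched as a small in-memory index. This is only an illustration in Python: `TagIndex` and its method names are hypothetical and are not part of Kapacitor, and the blob IDs are placeholder strings.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory tag index; not Kapacitor's implementation."""

    def __init__(self):
        # tag name -> list of blob IDs, oldest association first
        self._history = defaultdict(list)

    def tag(self, name, blob_id):
        """Point `name` at `blob_id`, keeping earlier associations."""
        self._history[name].append(blob_id)

    def resolve(self, name):
        """Return the blob ID most recently associated with `name`."""
        entries = self._history.get(name)
        if not entries:
            raise KeyError(name)
        return entries[-1]

    def history(self, name):
        """Return every blob ID `name` has pointed at, oldest first."""
        return list(self._history.get(name, []))


index = TagIndex()
index.tag("model", "id-a")
index.tag("model", "id-b")  # retag: "model" now points at a different blob
latest = index.resolve("model")
previous = index.history("model")
```

Keeping an append-only list per tag is one simple way to get both behaviours at once: the last entry answers lookups by tag, and the full list is the preserved history.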
## Uses

The following sections describe the uses of the Kapacitor blob store.

### Snapshots

Kapacitor will periodically snapshot the state of a running task (currently only implemented for UDFs).
When a task is started, its previous snapshot or a named snapshot is restored.

Kapacitor tasks construct a [DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph) of the data pipeline.
Each step in this DAG is called a node.
Snapshots are associated with a single node within a single task.
All nodes are assigned IDs based on the DAG structure.
When the DAG changes, the previous snapshots are considered invalid and are no longer used to restore task state.
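The document does not say how node IDs are derived from the DAG, so the following Python sketch is only one possible scheme, stated as an assumption: each node's ID is a hash of its kind and of its parents' IDs. The function `node_ids`, the node kinds, and the `(task, node ID)` snapshot key are all hypothetical. The point it illustrates is that a structural ID changes when the DAG changes, so an old snapshot is simply no longer found.

```python
import hashlib


def node_ids(nodes, edges):
    """Assign each node an ID derived from its kind and its parents' IDs.

    nodes: dict of node name -> node kind (e.g. "stream", "window")
    edges: list of (parent, child) name pairs; must form a DAG
    """
    parents = {name: [] for name in nodes}
    for parent, child in edges:
        parents[child].append(parent)

    ids = {}

    def visit(name):
        # Memoized walk: a node's ID depends on the IDs of all its ancestors.
        if name not in ids:
            parent_ids = sorted(visit(p) for p in parents[name])
            payload = "|".join([nodes[name]] + parent_ids)
            ids[name] = hashlib.sha256(payload.encode()).hexdigest()[:12]
        return ids[name]

    for name in nodes:
        visit(name)
    return ids


# Original pipeline: src -> win -> udf
before = node_ids(
    {"src": "stream", "win": "window", "udf": "udf"},
    [("src", "win"), ("win", "udf")],
)
# Changed pipeline: the window step is removed, so udf has a new parent.
after = node_ids({"src": "stream", "udf": "udf"}, [("src", "udf")])

# Snapshots keyed by (task, node ID); the old entry no longer matches.
snapshots = {("my_task", before["udf"]): b"saved state"}
restored = snapshots.get(("my_task", after["udf"]))
```

In this scheme the unchanged `src` node keeps its ID, while `udf` gets a new one because its upstream structure changed, so `restored` is `None` and the stale snapshot is ignored.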
### UDFs

UDFs can explicitly save blobs to and request blobs from the store via the protobuf socket connection with Kapacitor.
A common use case is to load and store trained model data.
How you use the blob store within your UDF is up to you.


## Design

The blob store will use content-addressable IDs (i.e. a shasum of the content) and will be exposed via the HTTP API of Kapacitor.

Blobs can be created, named and deleted.
Creating a blob accepts only the content of the blob and returns the ID of the blob.
Naming a blob associates a specified name with the content of the blob.
A naming history is recorded, allowing users to determine the "version" history for a given name.
Deleting a blob removes it from the store.
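As a rough illustration of content addressing combined with the create and delete operations, here is a minimal in-memory sketch in Python. It is not Kapacitor's implementation or its HTTP API: the `BlobStore` class and its methods are hypothetical, and SHA-256 is an assumption, since the design only says "shasum". Naming is left out to keep the sketch short.

```python
import hashlib
import io


class BlobStore:
    """Hypothetical in-memory, content-addressed store (illustration only)."""

    def __init__(self):
        self._blobs = {}  # blob ID -> content bytes

    def create(self, stream, chunk_size=64 * 1024):
        """Read a binary stream to its end and return the blob's ID."""
        digest = hashlib.sha256()  # assumption: SHA-256 as the "shasum"
        chunks = []
        # Hash chunk by chunk so the ID is computed without one large read.
        # A real store would write chunks to disk instead of buffering them.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
            chunks.append(chunk)
        blob_id = digest.hexdigest()
        # Identical content always hashes to the same ID, so it is stored once.
        self._blobs.setdefault(blob_id, b"".join(chunks))
        return blob_id

    def get(self, blob_id):
        """Return a readable stream over the blob's content."""
        return io.BytesIO(self._blobs[blob_id])

    def delete(self, blob_id):
        """Remove the blob; raises KeyError if the ID is unknown."""
        del self._blobs[blob_id]


store = BlobStore()
first = store.create(io.BytesIO(b"trained model bytes"))
again = store.create(io.BytesIO(b"trained model bytes"))
other = store.create(io.BytesIO(b"different bytes"))
content = store.get(first).read()
store.delete(other)
```

Because the ID is derived from the content, creating the same content twice yields the same ID, which fits the requirement that stored data is immutable: changing the bytes necessarily produces a different blob.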