Skip to content

Commit

Permalink
[CELEBORN-824][DOC] Add PushData document
Browse files Browse the repository at this point in the history
### What changes were proposed in this pull request?
As title.

### Why are the changes needed?
As title

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
No.

Closes apache#1747 from waitinfuture/824.

Authored-by: zky.zhoukeyong <[email protected]>
Signed-off-by: zky.zhoukeyong <[email protected]>
  • Loading branch information
waitinfuture committed Jul 24, 2023
1 parent 2752154 commit 8e84964
Show file tree
Hide file tree
Showing 5 changed files with 124 additions and 6 deletions.
2 changes: 1 addition & 1 deletion docs/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ cd $SPARK_HOME
--conf spark.shuffle.service.enabled=false
```
Then run the following test case:
```shell
```scala
spark.sparkContext.parallelize(1 to 10, 10)
.flatMap( _ => (1 to 100).iterator
.map(num => num)).repartition(10).count
Expand Down
4 changes: 4 additions & 0 deletions docs/assets/img/softsplit.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
8 changes: 4 additions & 4 deletions docs/developers/overview.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,15 +23,15 @@ please refer to dedicated articles.
In distributed compute engines, data exchange between compute nodes is common but expensive. The cost comes from
the disk and network inefficiency (M * N between Mappers and Reducers) in traditional shuffle frame, as following:

![ESS](/assets/img/ess.svg)
![ESS](../../assets/img/ess.svg)

Besides inefficiency, traditional shuffle framework requires large local storage in compute node to store shuffle
data, thus blocks the adoption of disaggregated architecture.

Apache Celeborn(Incubating) solves the problems by reorganizing shuffle data in a more efficient way, and storing the data in
a separate service. The high level architecture of Celeborn is as follows:

![Celeborn](/assets/img/celeborn.svg)
![Celeborn](../../assets/img/celeborn.svg)

## Components
Celeborn(Incubating) has three primary components: Master, Worker, and Client.
Expand Down Expand Up @@ -73,8 +73,8 @@ it just needs one network connection and sequentially read the coarse grained fi
In abnormal cases, such as when the file grows too large, or push data fails, Celeborn spawns a new split of the
`PartitionLocation`, and future data within the partition will be pushed to the new split.

Client keeps the split information and tells reducer to read from all splits of the `PartitionLocation` to guarantee
no data is lost.
`LifecycleManager` keeps the split information and tells reducer to read from all splits of the `PartitionLocation`
to guarantee no data is lost.

## Data Storage
Celeborn stores shuffle data in configurable multiple layers, i.e. `Memroy`, `Local Disks`, `Distributed File System`,
Expand Down
Loading

0 comments on commit 8e84964

Please sign in to comment.