Added parse.rst and mrtask.rst.
tomkraljevic committed May 27, 2014
1 parent 56cf622 commit 9426946
Showing 3 changed files with 44 additions and 0 deletions.
13 changes: 13 additions & 0 deletions h2o-docs/source/developuser/mrtask.rst
@@ -0,0 +1,13 @@
Job/MRTask/FJTask Overview
==========================

A GLM job is broken down into MRTask2 tasks, which are in turn broken
down into Fork/Join (FJ) tasks. The FJ tasks are spread across the
cluster as a logarithmic-depth tree: computation is performed at the
leaves, and partial results are rolled up to the root.
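
The tree-shaped decomposition described above can be illustrated with
the standard Java Fork/Join framework. This is a minimal sketch, not
H2O code: the class ``TreeSum``, the constant ``LEAF_SIZE``, and the
choice of summation as the computation are assumptions made for
illustration, and the sketch runs inside a single JVM rather than
across a cluster. It only shows the split, compute-at-leaves, and
roll-up pattern.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical illustration class; not part of H2O.
class TreeSum extends RecursiveTask<Long> {
    // Assumed leaf granularity: ranges this small are computed directly.
    private static final int LEAF_SIZE = 4;

    private final long[] data;
    private final int lo; // inclusive start of this task's range
    private final int hi; // exclusive end of this task's range

    TreeSum(long[] data, int lo, int hi) {
        this.data = data;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        // Leaf: perform the actual computation on a small range.
        if (hi - lo <= LEAF_SIZE) {
            long sum = 0;
            for (int i = lo; i < hi; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Inner node: split the range in half, giving a tree of
        // logarithmic depth in the number of elements.
        int mid = lo + (hi - lo) / 2;
        TreeSum left = new TreeSum(data, lo, mid);
        TreeSum right = new TreeSum(data, mid, hi);
        left.fork();                      // run the left half asynchronously
        long rightSum = right.compute();  // compute the right half in this thread
        long leftSum = left.join();       // wait for the left half
        // Roll the two partial results up toward the root.
        return leftSum + rightSum;
    }

    // Convenience entry point that runs the whole tree on the common pool.
    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new TreeSum(data, 0, data.length));
    }
}
```

Calling ``TreeSum.sum(values)`` on a ``long[]`` returns the sum of its
elements. Each inner node only combines the results of its two
children, so the combine step at the root sees two partial results
regardless of the input size.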

.. image:: PngGen/pictures/GLMAlgoMem.png
   :width: 90 %

|
|
29 changes: 29 additions & 0 deletions h2o-docs/source/developuser/parse.rst
@@ -0,0 +1,29 @@
Parse Overview
==============

(This example uses HDFS as the data source because it is widely used.
H2O also supports other data sources.)

The parse process typically moves each piece of data up to twice
during ingestion.

The first movement of the data occurs when the data is read from disk
(an f-chunk in the diagram below) and copied across the network to the
H2O node that requested the particular piece of data from the
filesystem.

.. image:: PngGen/pictures/DataIngestion.png
   :width: 90 %

|
|

The data is then moved a second time from the H2O node where the raw
data gets parsed to the H2O node where the compressed data will reside
in a Fluid Vector chunk (a p-chunk in the diagram below).

.. image:: PngGen/pictures/Parse.png
   :width: 90 %

|
|
2 changes: 2 additions & 0 deletions h2o-docs/source/developuser/top_developer.rst
@@ -27,3 +27,5 @@ H2O Software Architecture

h2o_stack
how_r_interacts_with_h2o
parse
mrtask
