2020-04-23 | Watch the video | This folder contains the presentation and sample notebooks
Note: Multiple personas are involved at different points of a data pipeline.
This demo uses a multi-notebook approach with these personas in mind. The work is divided so that data investigation tasks are demoed in separate notebooks, which emphasizes that you can take part in one stage or in the whole pipeline, as you choose.
Include - A notebook that is included in the other notebooks and defines the base parameters.
To change how the file paths are organized, set them here.
Setup & Teardown - Companion notebooks that contain helper routines for setup and teardown.
Run these to create or tear down the setup. They also contain instructions on cloud infrastructure setup.
- 1-Data Ingest - Reads the incoming streaming data and lands it in the bronze zone
- 1a-Read Bronze - A companion notebook that simulates a downstream persona consuming data from the bronze table
- 2-Data Refinement - The next notebook, which reads the incoming stream from the bronze table, refines it, and lands it in the silver zone
- 2a-Read Silver - A companion notebook that simulates a downstream persona consuming data from the silver table
- 3-Data Rollup - The next notebook, which reads the incoming stream from the silver table, rolls it up, and lands it in the gold zone
- 3a-Read Gold - A companion notebook that simulates a downstream persona consuming data from the gold table
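
The flow through the three zones listed above can be sketched in plain Python. This is only an illustration of the data flow, not the notebooks' code: in-memory lists stand in for the bronze, silver, and gold tables, nothing is streamed, and the record fields (`device`, `reading`) and function names are assumptions made for the example.

```python
from collections import defaultdict


def ingest(raw_events):
    """Bronze: land incoming events unchanged."""
    return list(raw_events)


def refine(bronze):
    """Silver: keep well-formed rows and cast the reading to float."""
    silver = []
    for row in bronze:
        try:
            silver.append(
                {"device": row["device"], "reading": float(row["reading"])}
            )
        except (KeyError, TypeError, ValueError):
            continue  # drop rows with missing or unparseable fields
    return silver


def rollup(silver):
    """Gold: average reading per device."""
    totals = defaultdict(lambda: [0.0, 0])  # device -> [sum, count]
    for row in silver:
        acc = totals[row["device"]]
        acc[0] += row["reading"]
        acc[1] += 1
    return {device: total / count for device, (total, count) in totals.items()}


if __name__ == "__main__":
    # Hypothetical raw events, including two malformed ones.
    raw = [
        {"device": "a", "reading": "1.0"},
        {"device": "a", "reading": "3.0"},
        {"device": "b", "reading": "oops"},
        {"reading": "2.0"},
    ]
    bronze = ingest(raw)
    silver = refine(bronze)
    gold = rollup(silver)
    print(len(bronze), len(silver), gold)
```

Each stage reads only the output of the stage before it, which is what lets a downstream persona (the 1a, 2a, and 3a notebooks) consume any one zone without running the others.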