2020-04-23 | Multi-hop Delta Lake Streaming

Predictive Maintenance (PdM) on IoT Data for Early Fault Detection w/ Delta Lake

2020-04-23 | Watch the video | This folder contains the presentation and sample notebooks

Note: Multiple personas are involved at different points of a data pipeline.

This demo takes a multi-notebook approach with these personas in mind: the work is divided so that data investigation tasks are demoed in separate notebooks, emphasizing that you can take part in one stage of the pipeline or in the whole pipeline, as you choose.

Notebook Organization:

Include - A notebook that is included in the other notebooks and defines base parameters
If you wish to change the organization of the file paths, set them here

Setup & Teardown - Companion notebooks that have helper routines for setup and teardown
Run these to create or tear down the setup; they include instructions on cloud infrastructure setup

  • 1-Data Ingest - Reads the incoming streaming data and lands it in the bronze zone
    • 1a-Read Bronze - A companion notebook that simulates a downstream persona consuming data from the bronze table
  • 2-Data Refinement - The next notebook, which reads the incoming stream from the bronze table, refines it, and lands it in the silver zone
    • 2a-Read Silver - A companion notebook that simulates a downstream persona consuming data from the silver table
  • 3-Data Rollup - The next notebook, which reads the incoming stream from the silver table, refines it, and lands it in the gold zone
    • 3a-Read Gold - A companion notebook that simulates a downstream persona consuming data from the gold table
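
The bronze, silver, and gold hops listed above can be pictured with a small plain-Python sketch. It is only a conceptual illustration, not the notebooks' code: it uses in-memory lists rather than Spark Structured Streaming or Delta tables, and the field names (`device_id`, `temp`, `ts`), the sample events, the validation rule, and the per-device average are all assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw IoT events; field names and values are illustrative only.
RAW_EVENTS = [
    {"device_id": "d1", "temp": "71.5", "ts": "2020-04-23T10:00:00"},
    {"device_id": "d1", "temp": "bad", "ts": "2020-04-23T10:01:00"},
    {"device_id": "d2", "temp": "64.0", "ts": "2020-04-23T10:00:00"},
    {"device_id": "d1", "temp": "72.5", "ts": "2020-04-23T10:02:00"},
]


def to_bronze(raw_events):
    """Bronze hop: land records as received, adding only an ingest sequence number."""
    return [dict(event, ingest_seq=i) for i, event in enumerate(raw_events)]


def to_silver(bronze_rows):
    """Silver hop: parse the reading to a float and drop rows that fail to parse."""
    silver = []
    for row in bronze_rows:
        try:
            temp = float(row["temp"])
        except (KeyError, ValueError):
            continue  # assumed rule: unparseable readings are discarded
        silver.append({"device_id": row["device_id"], "temp": temp, "ts": row["ts"]})
    return silver


def to_gold(silver_rows):
    """Gold hop: roll up to one summary row per device (assumed aggregation)."""
    temps_by_device = defaultdict(list)
    for row in silver_rows:
        temps_by_device[row["device_id"]].append(row["temp"])
    return {
        device: {"count": len(temps), "avg_temp": mean(temps)}
        for device, temps in sorted(temps_by_device.items())
    }


bronze = to_bronze(RAW_EVENTS)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Each function consumes only the output of the previous hop, which mirrors how a persona can read from the bronze, silver, or gold table without running the other stages.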