forked from databricks/devrel
Including presentation and slides from previous tech-talks
Showing 21 changed files with 1,954 additions and 0 deletions.
13 changes: 13 additions & 0 deletions
...2-27 | Getting Data Ready for Data Science with Delta Lake and MLflow/README.md
@@ -0,0 +1,13 @@
## Getting Data Ready for Data Science with Delta Lake

2020-02-27 | [Watch the video](https://www.youtube.com/watch?v=hQaENo78za0) | This folder contains the presentation and sample notebooks

Planning data science initiatives requires a holistic view of the entire data analytics realm. Data engineering is a key enabler of data science, helping furnish reliable, quality data in a timely fashion. Delta Lake, an open-source storage layer that brings reliability to data lakes, can help take your data reliability to the next level.

In this session you will learn about:
* The data science lifecycle
* The importance of data engineering to successful data science
* Key tenets of modern data engineering
* How Delta Lake can help make reliable data ready for analytics
* The ease of adopting Delta Lake for powering your data lake
* How to incorporate Delta Lake within your data infrastructure to enable data science
Binary file added
+3.57 MB
...Lambda - Introducing Delta Architecture/Beyond Lambda_ Introducing Delta Architecture.pdf
Binary file not shown.
42 changes: 42 additions & 0 deletions
...bda - Introducing Delta Architecture/Delta Architecture - Beyond Lambda Architecture.html
Large diffs are not rendered by default.
1 change: 1 addition & 0 deletions
...da - Introducing Delta Architecture/Delta Architecture - Beyond Lambda Architecture.ipynb
Large diffs are not rendered by default.
5 changes: 5 additions & 0 deletions
2020-03-05 | Beyond Lambda - Introducing Delta Architecture/README.md
@@ -0,0 +1,5 @@
## Beyond Lambda: Introducing Delta Architecture

2020-03-05 | [Watch the video](https://www.youtube.com/watch?v=FePv0lro0z8) | This folder contains the presentation and sample notebooks

Lambda architecture is a popular technique in which records are processed by a batch system and a streaming system in parallel. The results are then combined at query time to provide a complete answer. Strict latency requirements for processing both old and recently generated events made this architecture popular. Its key downside is the development and operational overhead of managing two different systems. There have been past attempts to unify batch and streaming into a single system, but organizations have not been very successful with them. With the advent of Delta Lake, however, we are seeing many of our customers adopt a simple continuous data flow model that processes data as it arrives. We call this the Delta Architecture. In this session, we cover the major bottlenecks to adopting a continuous data flow model and how the Delta Architecture solves them.
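
The contrast described above can be illustrated with a small, plain-Python sketch. It is not taken from the session notebooks and uses neither Spark nor Delta Lake; the event shape, the `cutoff` parameter, and all function names are hypothetical. It only shows why Lambda needs two code paths merged at query time, while a continuous flow keeps a single one.

```python
from collections import Counter

# Hypothetical event stream: each event has a timestamp and a key to count.
events = [
    {"ts": 1, "key": "click"},
    {"ts": 2, "key": "view"},
    {"ts": 3, "key": "click"},
    {"ts": 4, "key": "click"},
]

def batch_view(events, cutoff):
    """Batch layer: counts over events up to the last batch run (ts <= cutoff)."""
    return Counter(e["key"] for e in events if e["ts"] <= cutoff)

def speed_view(events, cutoff):
    """Speed layer: counts over events newer than the last batch run."""
    return Counter(e["key"] for e in events if e["ts"] > cutoff)

def lambda_query(events, cutoff):
    """Lambda: two separately maintained views are merged at query time."""
    return batch_view(events, cutoff) + speed_view(events, cutoff)

def continuous_query(events):
    """Continuous flow: one code path updates one table as each event arrives."""
    table = Counter()
    for e in sorted(events, key=lambda e: e["ts"]):
        table[e["key"]] += 1
    return table

# Both approaches give the same answer; the difference is how many
# systems and code paths have to be built and operated to get it.
print(lambda_query(events, cutoff=2) == continuous_query(events))
```

In a real deployment the single table would be a Delta Lake table written by a streaming job; the dictionary-based table here is only a stand-in for that idea.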
11 changes: 11 additions & 0 deletions
...03-12 | Simplify and Scale Data Engineering Pipelines with Delta Lake/README.md
@@ -0,0 +1,11 @@
## Simplify and Scale Data Engineering Pipelines with Delta Lake

2020-03-12 | [Watch the video](https://youtu.be/qtCxNSmTejk?t=190) | This folder contains the presentation and sample notebooks

A common data engineering pipeline architecture uses tables that correspond to different quality levels, progressively adding structure to the data: data ingestion (“Bronze” tables), transformation/feature engineering (“Silver” tables), and machine learning training or prediction (“Gold” tables). Combined, we refer to these tables as a “multi-hop” architecture. It allows data engineers to build a pipeline that begins with raw data as a “single source of truth” from which everything flows. In this session, we will show how to build a scalable data engineering pipeline using Delta Lake. Delta Lake is an open-source storage layer that brings reliability to data lakes. It offers ACID transactions and scalable metadata handling, and it unifies streaming and batch data processing. It runs on top of your existing data lake and is fully compatible with Apache Spark APIs. In this session you will learn about:

* The data engineering pipeline architecture
* Data engineering pipeline scenarios
* Data engineering pipeline best practices
* How Delta Lake enhances data engineering pipelines
* The ease of adopting Delta Lake for building your data engineering pipelines
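
As a rough illustration of the multi-hop idea, the following plain-Python sketch passes a few records through Bronze, Silver, and Gold stages. It is not the session notebook and uses in-memory lists instead of Delta tables; the record fields (`device`, `temp_c`), the function names, and the per-device average standing in for a Gold table are all assumptions made for this example.

```python
import json
from collections import defaultdict

def to_bronze(raw_lines):
    """Bronze: ingest every record as-is, keeping the raw payload."""
    return [{"raw": line} for line in raw_lines]

def to_silver(bronze):
    """Silver: parse and validate records, dropping malformed ones."""
    silver = []
    for row in bronze:
        try:
            rec = json.loads(row["raw"])
            silver.append({"device": str(rec["device"]),
                           "temp_c": float(rec["temp_c"])})
        except (ValueError, KeyError, TypeError):
            # Invalid JSON or missing/ill-typed fields stay only in Bronze.
            continue
    return silver

def to_gold(silver):
    """Gold: aggregate cleaned records (here, average temperature per device)."""
    sums = defaultdict(lambda: [0.0, 0])
    for rec in silver:
        sums[rec["device"]][0] += rec["temp_c"]
        sums[rec["device"]][1] += 1
    return {device: total / n for device, (total, n) in sums.items()}

# Hypothetical raw input, including two records that fail validation.
raw_lines = [
    '{"device": "a", "temp_c": 20.0}',
    '{"device": "a", "temp_c": 22.0}',
    'not json',
    '{"device": "b"}',
    '{"device": "b", "temp_c": 18.5}',
]

bronze = to_bronze(raw_lines)   # raw data, the "single source of truth"
silver = to_silver(bronze)      # structured, validated records
gold = to_gold(silver)          # aggregate ready for downstream use
print(len(bronze), len(silver), gold)
```

Because Bronze keeps every raw record, the Silver and Gold stages can be recomputed from it whenever the parsing or aggregation logic changes.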
42 changes: 42 additions & 0 deletions
...elines with Delta Lake/Simplify and Scale Data Engineering Pipelines with Delta Lake.html
Large diffs are not rendered by default.
1 change: 1 addition & 0 deletions
...lines with Delta Lake/Simplify and Scale Data Engineering Pipelines with Delta Lake.ipynb
Large diffs are not rendered by default.
Binary file added
+5.24 MB
...pelines with Delta Lake/Simplify and Scale Data Engineering Pipelines with Delta Lake.pdf
Binary file not shown.