This repository has been archived by the owner on Nov 27, 2023. It is now read-only.
forked from amundsen-io/amundsen
Commit
feat: Delta table partition watermarks (amundsen-io#1694)
* Install pyspark for dev work
  So now we can run pytest on a fresh clone. Due to the rather old version this will throw some DeprecationWarning messages, but we can upgrade to 3.1 at a later stage.
  Signed-off-by: Harm Weites <[email protected]>
* Read watermarks for Delta tables
  Signed-off-by: Harm Weites <[email protected]>
* Include tests
  Signed-off-by: Harm Weites <[email protected]>
* More proper watermark yielding
  Signed-off-by: Harm Weites <[email protected]>
* Select the partition_column
  Going with the first item of the returned list will return the same column, which is not deterministic at all (given there are multiple partitions).
  Signed-off-by: Harm Weites <[email protected]>
* Cut the line length
  Signed-off-by: Harm Weites <[email protected]>
* Only process partitions of a workable type
  Since watermarking strings doesn't make much sense, keep to checking integer/float/date/datetime types.
  Signed-off-by: Harm Weites <[email protected]>
* Updated tests
  Signed-off-by: Harm Weites <[email protected]>
* Oops, the .first() returns a Row object
  Signed-off-by: Harm Weites <[email protected]>
* Wrap this extraction in a try/except
  There are scenarios where a dataset exists, but is empty. In this case .first() will fail.
  Signed-off-by: Harm Weites <[email protected]>
* Flake8 fixes
  Signed-off-by: Harm Weites <[email protected]>
* Simplicity
  Signed-off-by: Harm Weites <[email protected]>
* Revert "Simplicity"
  This reverts commit 06b9fc3. Working with this as part of job.launch() brings errors, where the original code would bring the desired result.
  Signed-off-by: Harm Weites <[email protected]>
* Simplicity in return typing
  Signed-off-by: Harm Weites <[email protected]>
* There is no complexity here :jedi_hand_wave:
  Signed-off-by: Harm Weites <[email protected]>
* Pass the mypy
  Signed-off-by: Harm Weites <[email protected]>
* Fix the return type here, finally
  Signed-off-by: Harm Weites <[email protected]>
* Fix import sorting order
  Signed-off-by: Harm Weites <[email protected]>
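The commit message describes three behaviours: the partition column is selected by name, only partition columns of integer, float, date, or datetime type are watermarked, and the extraction is wrapped in a try/except because an existing but empty dataset has no first row. The following is a minimal plain-Python sketch of that logic, not the code from this commit. It assumes in-memory rows instead of a Spark DataFrame, and the names `WORKABLE_TYPES` and `partition_watermarks` as well as the `low_watermark`/`high_watermark` labels are hypothetical choices made for illustration.

```python
from datetime import date, datetime
from typing import Any, Dict, Iterator, List, Tuple

# Assumption: the value types for which a min/max watermark is meaningful.
WORKABLE_TYPES = (int, float, date, datetime)


def partition_watermarks(
    rows: List[Dict[str, Any]],
    partition_columns: List[str],
) -> Iterator[Tuple[str, str, Any]]:
    """Yield (column, watermark_label, value) for each workable partition column."""
    for column in partition_columns:
        try:
            # Select the partition column by name rather than by position.
            values = [row[column] for row in rows]
            first = values[0]  # raises IndexError when the dataset is empty
        except (IndexError, KeyError):
            # Empty dataset or missing column: nothing to watermark.
            continue
        # Skip strings and other unworkable types; bool is excluded explicitly
        # because it is a subclass of int.
        if isinstance(first, bool) or not isinstance(first, WORKABLE_TYPES):
            continue
        yield column, "low_watermark", min(values)
        yield column, "high_watermark", max(values)


# Hypothetical sample data: one date partition and one string partition.
rows = [
    {"ds": date(2021, 1, 1), "region": "eu"},
    {"ds": date(2021, 1, 3), "region": "us"},
]
result = list(partition_watermarks(rows, ["ds", "region"]))
empty_result = list(partition_watermarks([], ["ds"]))
```

Here `result` holds a low and a high watermark for `ds` only, since `region` is a string column, and `empty_result` is an empty list because the empty dataset is skipped instead of raising.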
Showing 3 changed files with 214 additions and 7 deletions.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -13,3 +13,4 @@ pytest-cov>=2.12.0 |
| 13 | 13 | pytest-env>=0.6.2 |
| 14 | 14 | pytest-mock>=3.6.1 |
| 15 | 15 | typed-ast>=1.4.3 |
| | 16 | +pyspark==3.0.1 |