Stars
This repository provides a command-line interface (CLI) utility that replicates an Amazon Managed Workflows for Apache Airflow (MWAA) environment locally.
Source for creating a basic ClickHouse cluster with sharding only.
The most popular ClickHouse plugin for Airflow. 🔝 Top 1% of downloads on PyPI (https://pypi.org/project/airflow-clickhouse-plugin). Based on mymarilyn/clickhouse-driver.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Python library to interact with KeePass databases (supports KDBX3 and KDBX4)
Curated list of resources about Apache Airflow
Run on all nodes of your cluster before the cluster starts; lets you customize your cluster
Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.
Columnar storage extension for Postgres built as a foreign data wrapper. Check out https://github.com/citusdata/citus for a modernized columnar storage implementation built as a table access method.
Source code accompanying BigQuery: The Definitive Guide by Lakshmanan & Tigani, to be published by O'Reilly Media
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Labs and demos for courses for GCP Training (http://cloud.google.com/training).
GCP extensions for Jupyter and JupyterLab
Code templates to make working with Kubernetes feel like editing and debugging local code.
Code samples used on cloud.google.com