- London, United Kingdom
Stars
Builds Airflow DAGs from configuration files. Powers all DAGs on the Etsy Data Platform
Simple and powerful factories for mock data generation
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
vim, zsh, git, homebrew, neovim - my whole world
This dbt package contains macros to support unit testing that can be (re)used across dbt projects.
📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Synmetrix – production-ready open source semantic layer on Cube
Get ready for dotfiles. Contains i3, i3blocks, rofi, dunst, picom, vim, tmux, and zsh.
My dotfiles managed by GNU Stow - Arch, i3-gaps, bspwm, ncmpcpp, (neo)vim, zsh etc.
SQL upsert using pandas DataFrames for PostgreSQL, SQLite and MySQL with extra features
StreamAlert is a serverless, realtime data analysis framework which empowers you to ingest, analyze, and alert on data from any environment, using datasources and alerting logic you define.
A next-generation curated knowledge sharing platform for data scientists and other technical professions.
A toolkit providing a uniform interface for connecting to and extracting data from a wide variety of (potentially remote) data stores (including HDFS, Hive, Presto, MySQL, etc).
Augment Beancount importers with machine learning functionality.
Modern and easy to use SQL client for MySQL, Postgres, SQLite, SQL Server, and more. Linux, macOS, and Windows.
A vault for securely storing and accessing AWS credentials in development environments
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
Docker image for Airbnb's Superset
Repository for Docker Image of Apache-Superset. [Docker Image: https://hub.docker.com/r/abhioncbr/docker-superset]
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable,…
Fast iterative local development and testing of Apache Airflow workflows
Turbine: the bare metals that gets you Airflow