Stars
re_data - fix data issues before your users & CEO discover them 😊
A Model Context Protocol (MCP) server implementation for DuckDB, providing database interaction capabilities
Use DuckDB within Excel with the xlDuckDb add-in
Advent of Code - 30 challenges for learning Dagster
A configuration-driven framework for building Dagster pipelines that enables teams to create and manage data workflows using YAML/JSON instead of code
Configure and enforce conventions for your dbt project.
A lightweight tool for evaluating dbt-core selectors against any dbt project manifest.
A dbt-core plugin to weave together multi-project dbt-core deployments
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
Typer, build great CLIs. Easy to code. Based on Python type hints.
Rich is a Python library for rich text and beautiful formatting in the terminal.
The data-validation toolkit for enhanced dbt (data build tool) PR review
A lightweight Python-based tool for extracting and analyzing data column lineage for dbt projects
Framework for building data agent workflows
Code for extracting, parsing and annotating tables from GitTables (https://gittables.github.io).
End-to-end data engineering project to get insights from PyPI using Python, DuckDB, MotherDuck & Evidence
Amazon Redshift Utils contains utilities, scripts and views that are useful in a Redshift environment
ingestr is a CLI tool to copy data between any databases with a single command seamlessly.
SQL Lineage Analysis Tool powered by Python