Starred repositories
Data Engineering with Databricks Cookbook, published by Packt
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
List of tech events taking place in Brazil
This repository exemplifies a simple ELT process that uses Delta to perform upserts and remove data files that are no longer in the latest state of the table's transaction log.
A tool for exploring each layer in a Docker image
Dashboards and notebooks in a single place. Create powerful and flexible dashboards using code, or build beautiful Notion-like notebooks and share them with your team.
A map transformer which implements the `Stream Maps` capability from Meltano's tap and target SDK: https://sdk.meltano.com/
A Singer tap for extracting data from GitHub. Powered by the Meltano SDK for Singer Taps: https://sdk.meltano.com
The single source of truth for all Meltano plugins, including all available Singer Taps and Targets: https://hub.meltano.com
Big Data Ecosystem Docker
Translation of the book Pense em Python (2nd ed.), by Allen B. Downey
IGTI MBA in Data Engineering - Cloud Data Engineer Bootcamp - Final challenge
Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Write 70% less code by using the SDK to build custom extractors and loaders that adhere to the Singer standard: https://sdk.meltano.com
A very simple Salesforce.com REST API client for Python
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
Example code for Fluent Python, 2nd edition (O'Reilly 2022)
Data Engineering with AWS, 2nd edition - Published by Packt