aimanmuhamad
🏠 Working from home

An Awesome List of Open-Source Data Engineering Projects

2,193 358 Updated Oct 4, 2024

Implementing best practices for PySpark ETL jobs and applications.

Python 1,730 723 Updated Jan 1, 2023

Data Engineering Practice Problems

Python 1,811 516 Updated Oct 31, 2024

Data Engineering Handbook for beginners and everyone

Makefile 30 4 Updated Jul 13, 2024

Chronon is a data platform for serving AI/ML applications.

Scala 756 56 Updated Jan 3, 2025

Repository with code examples for MLflow

Jupyter Notebook 66 23 Updated Dec 6, 2024

Singer.io Tap for MongoDB - PipelineWise compatible

Python 5 24 Updated Sep 2, 2024

Data pipeline for uploading, preprocessing, and visualising COVID19 data

Python 18 1 Updated Apr 1, 2023

The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com)

Python 219 56 Updated Dec 20, 2024

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Java 10,656 3,054 Updated Jan 4, 2025

Demo showing how the dlt load_info can be used to create a data lineage overview.

Python 3 Updated Nov 28, 2023

Free MLOps course from DataTalks.Club

Jupyter Notebook 11,286 2,178 Updated Sep 9, 2024

📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics

Rust 18,106 1,786 Updated Jan 4, 2025

Google Machine Learning for Solutions Architects, published by Packt

Jupyter Notebook 26 17 Updated Jul 10, 2024

Data Engineering in Bioinformatics, published by Packt

Python 2 3 Updated Nov 27, 2023

AI for DevOps and Site Reliability Engineering, published by Packt

Python 7 3 Updated Oct 29, 2024

Mastering Python Design Patterns, Third Edition by Packt Publishing

Python 62 26 Updated May 22, 2024

Polars Cookbook, published by Packt

Jupyter Notebook 309 46 Updated Oct 28, 2024

Building ETL Pipelines with Python

Jupyter Notebook 116 162 Updated Jul 12, 2024
Jupyter Notebook 197 48 Updated Oct 28, 2024

Best Practices on Recommendation Systems

Python 19,558 3,132 Updated Jan 5, 2025

An open-source project dedicated to constructing robust data pipelines and scalable software infrastructure. We leverage industry-standard tools favored by developers to enhance efficiency and reli…

Python 33 3 Updated May 27, 2024

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

Python 2,886 190 Updated Jan 3, 2025

A template repository for deploying dbt to Cloud Run

HTML 6 3 Updated Mar 22, 2024

All-in-one Modern Data Stack (MDS) in a box

3 Updated Dec 4, 2024

Food for thought on data contracts

Python 24 6 Updated Nov 15, 2024

Template for a data contract used in a data mesh.

467 48 Updated Mar 13, 2024