Stars
Create a data pipeline on AWS to execute batch processing in a Spark cluster provisioned by Amazon EMR. ETL using managed Airflow: extract data from S3, transform data using Spark, load transforme…
Apache Spark notebook to perform ETL processes on OpenMRS data
Spark data pipeline that processes movie ratings data.
This is a simple ETL using Airflow. First, we fetch data from an API (extract). Then, we drop unused columns, convert to CSV, and validate (transform). Finally, we load the transformed data to databas…
Airflow DAGs for the Stellar ETL project
Developed a data pipeline to automate data warehouse ETL by building custom Airflow operators that handle the extraction, transformation, validation, and loading of data from S3 -> Redshift -> S3.
Educational project on how to build an ETL (Extract, Transform, Load) data pipeline, orchestrated with Airflow.
Real-time ETL pipeline for financial data (Kafka, PySpark).
Airflow DAGs for exporting, loading, and parsing the Ethereum blockchain data. How to get any Ethereum smart contract into BigQuery https://towardsdatascience.com/how-to-get-any-ethereum-smart-cont…
A Docker image and Kubernetes config files to run Airflow on Kubernetes
PySpark functions and utilities with examples. Assists the ETL process of data modeling.
Power Pop Health is a collection of content intended to simplify the process of ingesting and prepping Healthcare Open Data using Azure data tools and Power BI. Moving forward the overarching theme…
Udacity Data Engineering Nanodegree (DEND)
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal transactions. Data is available in…
This project covers Dockerizing a Python Flask-based credit risk assessment calculator web application integrated with two different deep learning and transfer learning based …
A CI/CD pipeline for a machine learning model on Google Cloud Run
This repo provides a step-by-step guide to building a CI/CD pipeline on Azure ML.
This is an AWS MLE and MLOps Bank Customers Churn Prediction Project.
ZenML 🙏: The bridge between ML and Ops. https://zenml.io.
A collection of full-time roles in SWE, Quant, and PM for new grads.
Model to segment 3D MRI images using a 3D U-Net-based FCN architecture and convert the segmentation to a surface mesh. Please see the link below for the full paper.
Computer vision deep learning with medical images (MSc DS final project)