- EnergyAustralia
- Melbourne, Australia
Starred repositories
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
An open-source framework that simplifies implementation of data solutions.
Learn how to design, develop, deploy and iterate on production-grade ML applications.
Python packaging and dependency management made easy
A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)
A dbt package for modelling dbt metadata. https://brooklyn-data.github.io/dbt_artifacts
Efficient data transformation and modeling framework that is backwards compatible with dbt.
💯 Curated coding interview preparation materials for busy software engineers
A curated list of docker-compose files prepared for testing data engineering tools, databases, and open source libraries.
On day 1 of any Airflow project, you can spin up a local desktop Kubernetes Airflow environment and one in Google Cloud Composer with tested data pipelines (DAGs) 🖥️ >> [ 🚀, 🚢 ]
Docker base images for Ruby, Python, Node.js and Meteor web apps
Master programming by recreating your favorite technologies from scratch.
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
A complete computer science study plan to become a software engineer.
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Kafka Connect Store Partitioner by custom fields and time
Public interface definitions of Google APIs.
A series of exercises to apply your Scala knowledge
The Data Explorer gives you fast, safe access to data stored in Cassandra, Dynomite, and Redis.
My Insight Data Engineering Fellowship project. I implemented a big data processing pipeline based on the lambda architecture that aggregates Twitter and US stock market data for user sentiment anal…
Containerized Spark, ready to use with AWS infrastructure
Flink CDC is a streaming data integration tool
A simple Spark-powered ETL framework that just works 🍺