Datawarehouse

Building a data warehouse tech stack with Airflow, MySQL, and Docker.

A database called datawarehouse was created, containing the tables needed for the scope of this project.

The stack consists of:

  • A data warehouse (MySQL, SQLite)
  • An orchestration service (Airflow)
  • An ELT tool (dbt)
  • A reporting environment (Redash)

Everything is set up locally using Docker.
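Once the containers are up, a quick check like the one below confirms that the MySQL warehouse is reachable. The host, port, credentials, and database name are assumptions for a typical local Docker setup, not values taken from this repository.

```python
# Connectivity check for the dockerized MySQL warehouse.
# User, password, host, port, and database name are assumed placeholders.
from sqlalchemy import create_engine, text

engine = create_engine("mysql+pymysql://airflow:airflow@localhost:3306/warehouse")

with engine.connect() as conn:
    # List the tables currently present in the warehouse database.
    for (table_name,) in conn.execute(text("SHOW TABLES")):
        print(table_name)
```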

The project tasks are:

  • Create a DAG in Airflow that uses the Bash/Python operator to load the data files into the database (see the sketch below). Think about a useful separation of Prod, Dev, and Staging environments.
  • Connect dbt with the DWH and write transformation code for the data, which can be executed via the Bash or Python operator in Airflow.
  • Write proper documentation for the data models and access the dbt docs UI for presentation.
  • Check additional dbt modules that can support data quality monitoring (e.g. great_expectations, dbt_expectations, or re-data).
  • Connect the reporting environment and create a dashboard from this data.
  • Write a short article about the approach and the most important decisions along the way.
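Below is a minimal sketch of the loading-and-transformation DAG described in the first two tasks. The CSV path, MySQL connection string, table name, and dbt project directory are placeholders for illustration, not values taken from this repository.

```python
# Hypothetical DAG: load a raw CSV into MySQL, then run the dbt models.
# All paths, credentials, and names below are assumed placeholders.
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator
from sqlalchemy import create_engine


def load_csv_to_mysql():
    # Load one raw data file into the warehouse (path/connection are placeholders).
    engine = create_engine("mysql+pymysql://airflow:airflow@dwh-mysql:3306/warehouse")
    df = pd.read_csv("/opt/airflow/data/raw.csv")
    df.to_sql("raw_data", engine, if_exists="replace", index=False)


with DAG(
    dag_id="load_and_transform",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load_data = PythonOperator(task_id="load_data", python_callable=load_csv_to_mysql)

    # Run the dbt transformations against the warehouse (project dir is a placeholder).
    run_dbt = BashOperator(
        task_id="run_dbt",
        bash_command="cd /opt/airflow/dbt && dbt run --profiles-dir .",
    )

    load_data >> run_dbt
```

Keeping the load and the dbt run as separate tasks makes it straightforward to add per-environment (Prod/Dev/Staging) variants of the same DAG later.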

Airflow tutorial

This is an example of how Airflow works: the first DAGs are created here, and GCP is used to demonstrate them.

dbt
