Coursework and projects from Udacity's Data Engineer NanoDegree

jmalinao19/Data-Engineer-ND-Projects

Data Engineer Nanodegree

This repository contains the projects from Udacity's Data Engineer Nanodegree. To learn more about the program, see Udacity's website or the course syllabus.

Concepts Covered

  • Course 1: Data Modeling

  • Course 2: Cloud Data Warehouses

    • Data Warehouse Architecture
    • Dimensional Modeling
    • Denormalizing 3NF database to Star Schema with ETL process
    • OLAP Cube and its Operations: Roll-up, Drill-down, Slice & Dice, Pivot
    • Cloud Computing with Amazon Web Services (AWS)
    • AWS Redshift Architecture
    • Set up AWS infrastructure as code (IaC)
    • Optimized table design with distribution styles and sort keys
    • Technologies Utilized:
      • Python, AWS (EC2, S3, IAM, VPC, RDS PostgreSQL, Redshift)
    • Project 3: Data Warehouse with AWS Redshift
  • Course 3: Data Lakes with Spark

    • Big Data Ecosystem
    • Distributed Systems
    • Data Wrangling with Spark
    • Data Lakes
    • Technologies Utilized: Python, PySpark, AWS (EC2, S3, IAM, EMR, Redshift)
    • Project 4: Data Lakes with AWS EMR and PySpark
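
As a rough illustration of the OLAP cube operations listed under Course 2, the sketch below applies roll-up, drill-down, slice, and dice to a tiny in-memory fact table using only the Python standard library. The `SALES` rows, the dimension names, and the helper functions (`aggregate`, `slice_rows`, `dice_rows`) are all hypothetical and are not taken from the course projects; a real warehouse would express the same ideas as SQL `GROUP BY` and `WHERE` clauses. Pivot is left out to keep the example short.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, month, country, revenue).
SALES = [
    (2023, 1, "US", 100.0),
    (2023, 1, "DE", 40.0),
    (2023, 2, "US", 60.0),
    (2024, 1, "US", 80.0),
    (2024, 1, "DE", 20.0),
]
DIMS = ("year", "month", "country")  # dimension columns, in row order
MEASURE_INDEX = 3                    # position of the revenue measure


def aggregate(rows, group_by):
    """Sum revenue grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in group_by]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[MEASURE_INDEX]
    return dict(totals)


def slice_rows(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]


def dice_rows(rows, filters):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [
        r for r in rows
        if all(r[DIMS.index(d)] in allowed for d, allowed in filters.items())
    ]


# Roll-up: aggregate to the coarser "year" level.
by_year = aggregate(SALES, ["year"])

# Drill-down: break the yearly totals out by month.
by_year_month = aggregate(SALES, ["year", "month"])

# Slice: keep only the US plane of the cube.
us_only = slice_rows(SALES, "country", "US")

# Dice: keep a sub-cube for 2023 across two countries.
sub_cube = dice_rows(SALES, {"year": {2023}, "country": {"US", "DE"}})

print(by_year)
print(by_year_month)
```

Roll-up and drill-down are the same aggregation run at different granularities, which is why a single `aggregate` helper covers both here.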
