Data Engineer Pipeline Project using AWS S3 - Glue - Athena - QuickSight

Jira-saki/Data-Engineer-AWS-immersion-day

Data Engineer Pipeline Project.

The ETL pipeline uses S3 as the data lake and AWS Glue ETL to access and transform the data stored there. Athena queries the result through a view, and QuickSight visualizes it.

Project Background

This is a data engineering immersion day project. It covers the following tasks: data validation and ETL with Glue to produce tables that can be queried with Amazon Athena and visualized with Amazon QuickSight.

Data architecture that needs to be created:



  1. Retrieve data from RDS Postgres and save it to the data lake as CSV files.
  2. Add a Glue Crawler to create the Data Catalog.
  3. Perform ETL using Glue Studio.
  4. Create a view with Athena.
  5. Create a visualization in QuickSight that displays a graph of sport events.
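
Steps 1 and 2 above could be scripted roughly as follows. This is only a sketch under stated assumptions, not the method used in this project (the import here was done with the AWS CLI). It serializes rows to CSV, uploads the file to S3, and registers a Glue crawler through boto3. The helper names, column names, bucket, role ARN, and database name are hypothetical placeholders.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize a header and data rows into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def crawler_request(name, role_arn, database, s3_path):
    """Build the keyword arguments for Glue's create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def stage_and_catalog(bucket, key, csv_text, request):
    """Upload the CSV to S3, then create and start the crawler.

    Requires boto3 and valid AWS credentials.
    """
    import boto3  # imported lazily so the helpers above work without it

    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=csv_text.encode("utf-8"))

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# Hypothetical sample data and resource names, for illustration only.
csv_text = rows_to_csv(["id", "sport"], [[1, "baseball"], [2, "football"]])
request = crawler_request(
    "tickets-csv-crawler",
    "arn:aws:iam::123456789012:role/ExampleGlueRole",
    "example_ticket_db",
    "s3://example-bucket/tickets/",
)
# stage_and_catalog("example-bucket", "tickets/sport_events.csv", csv_text, request)
```

The call to `stage_and_catalog` is left commented out because it needs real AWS resources. The two helper functions can be run and inspected locally.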

Prerequisites

  • AWS Account
  • IAM permission policies configured for Glue and S3

Getting Started

  • Import the data set from RDS Postgres into the data lake
    Use the AWS CLI to import the data.

  • Data lake (S3)
    The files are stored in the S3 "tickets" directory.


  • Add the Crawler process
    Create the Data Catalog (database and tables) in Glue. Edit the schema of each table.

  • Run a job in Glue Studio
    Check for incorrect schemas and create a job that writes the processed data in Parquet format.


  • Create a Glue Crawler for the Parquet files
    Add a crawler and run it. Once the crawler has finished running, the tables are added to the Data Catalog.




  • Create a view (Athena)
    Query the data and create a view with Amazon Athena. Use Athena workgroups to control query access and costs.
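
The Athena step could look something like the sketch below, assuming the view is created through boto3 rather than the Athena console. It builds a `CREATE OR REPLACE VIEW` statement and submits it inside a workgroup. The view, table, column, database, and workgroup names and the output location are hypothetical and do not come from this project.

```python
def create_view_sql(view_name, select_sql):
    """Wrap a SELECT statement in a CREATE OR REPLACE VIEW statement."""
    return f"CREATE OR REPLACE VIEW {view_name} AS\n{select_sql.strip()}"


def run_in_athena(sql, database, workgroup, output_location):
    """Submit a statement to Athena and return its query execution ID.

    Requires boto3 and valid AWS credentials.
    """
    import boto3  # imported lazily so create_view_sql works without it

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        WorkGroup=workgroup,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical view over a hypothetical Parquet table.
view_sql = create_view_sql(
    "sport_event_summary",
    """
    SELECT sport_type, COUNT(*) AS event_count
    FROM parquet_sport_event
    GROUP BY sport_type
    """,
)
# run_in_athena(view_sql, "example_ticket_db", "example-workgroup",
#               "s3://example-bucket/athena-results/")
```

Submitting the statement with an explicit `WorkGroup` lets the workgroup's access and cost settings apply to it.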




Visualization (QuickSight)

The data visualization is created from a dataset in QuickSight.

Follow Me On

https://www.linkedin.com/in/jirasak-pakdeeto-900665214/