Data Engineer Pipeline Project.

The ETL pipeline uses Amazon S3 as the data lake and AWS Glue ETL to access and transform the data stored there. Amazon Athena queries a view over the processed data, and Amazon QuickSight visualizes the results.

Project Background

This is a Data Engineering Immersion Day project. It covers data validation and ETL with AWS Glue, producing tables that can be queried with Amazon Athena and visualized with Amazon QuickSight.

The data architecture to be built:

Retrieve data from RDS Postgres and save it to the data lake as CSV files.
Add a Glue crawler to create the Data Catalog.
Perform ETL using Glue Studio.
Create a view with Athena.
Create a visualization in QuickSight that displays a sports events graph.

Prerequisites

AWS Account
IAM permission policies configured for Glue and S3

Getting Started

Import the data set from RDS Postgres to the data lake
Use the AWS CLI to import the data.
Data lake (S3)
The files are stored in the S3 "tickets" directory.
Add Crawler Process
Create the Data Catalog (database and tables) in Glue, then edit the schema of each table.
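The crawler step can also be scripted. Below is a minimal Python sketch using boto3, not the method used in this project, which works through the Glue console. The crawler name, IAM role ARN, database name, and bucket are hypothetical placeholders; only the "tickets" prefix comes from this README.

```python
def build_crawler_config(name, role_arn, database, s3_path):
    """Build the keyword arguments for the Glue create_crawler call.

    All values are supplied by the caller; nothing here is specific
    to this project's real resource names.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(config):
    """Create the crawler in Glue and start its first run.

    Requires boto3 and AWS credentials with Glue permissions.
    """
    import boto3  # imported lazily so the builder above works without AWS access

    glue = boto3.client("glue")
    glue.create_crawler(**config)
    glue.start_crawler(Name=config["Name"])


# Hypothetical values, for illustration only.
example_config = build_crawler_config(
    name="tickets-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    database="ticketdata",
    s3_path="s3://example-bucket/tickets/",
)
```

Calling `create_and_start_crawler(example_config)` would register the crawler and start it. The schema edits described above would still be made afterwards in the Glue console.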
Run a job in Glue Studio
Check for incorrect schemas and create a job that writes the processed data in Parquet format.
Create a Glue crawler for the Parquet files
Add a crawler and run it. Once the crawler has finished running, the new tables are added to the Data Catalog.
Create View (Athena)
Query the data and create a view with Amazon Athena, using Athena workgroups to control query access and costs.
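As an illustration of the view step, here is a minimal Python sketch that builds a CREATE VIEW statement and submits it to Athena through a workgroup with boto3. The view, table, column, database, and workgroup names are hypothetical, since this README does not list the real ones. The sketch also assumes the workgroup already has a query result location configured.

```python
def build_create_view_sql(view_name, source_table, columns):
    """Return an Athena statement that (re)creates a view over selected columns."""
    column_list = ", ".join(columns)
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS "
        f"SELECT {column_list} FROM {source_table}"
    )


def run_in_workgroup(sql, database, workgroup):
    """Submit a query to Athena in the given workgroup and return its execution ID.

    Requires boto3 and AWS credentials. Assumes the workgroup defines an
    output location, so no ResultConfiguration is passed here.
    """
    import boto3  # imported lazily so the SQL builder works without AWS access

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        WorkGroup=workgroup,
    )
    return response["QueryExecutionId"]


# Hypothetical names, for illustration only.
example_sql = build_create_view_sql(
    view_name="events_view",
    source_table="events_parquet",
    columns=["event_id", "sport_type", "start_date"],
)
```

Running `run_in_workgroup(example_sql, "ticketdata", "example-workgroup")` would create the view. Because the query runs inside a workgroup, any access controls and data-scan limits set on that workgroup apply to it.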

Visualization (QuickSight)

The data visualization is created from a dataset in QuickSight.

Follow Me On

https://www.linkedin.com/in/jirasak-pakdeeto-900665214/

Jira-saki/Data-Engineer-AWS-immersion-day
