- Thang Long University
- Ha Noi City
- https://longNguyen010203.github.io
- in/long-nguyen-de203
- Design and implement a data warehouse to manage automobile accident cases across 49 US states, using a star schema and Snowflake for the data warehouse architecture.
- Develop a real-time data ingestion pipeline using Kafka and Spark. Collect minute-level stock data from Yahoo Finance, ingest it into Kafka, and process it with Spark Streaming, storing the results…
- A data engineering project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, dbt, Polars, and Docker. Data comes from Kaggle and the YouTube API.
- 100Day-Self-Learning-DE Public
A self-study log covering more than 3 months at 3-4 hours a day, in preparation for applying for an intern or fresher position as a Data Engineer in 2024.
- A workshop on data engineering using Amazon Kinesis for real-time data processing, applied to big data processing.
- A workshop on data engineering using Amazon EMR for batch data processing, applied to big data processing.
- Welcome to my AWS Cloud Training repository! This repo contains notes, exercises, and projects from my AWS Cloud training journey, showcasing my progress and understanding of AWS services.
- Spark-Kafka-Self-Learning Public
A third-year student's self-study of Spark and Kafka as part of a data engineering journey, with the goal of securing an internship or fresher job in 2024.
- Spark-Processing-AWS Public
Set up and build a big data processing pipeline with Apache Spark and AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to set up the infrastructure and Airflow integration to automate wor…
- ECommerce-ELT-Pipeline Public
A data engineering project that implements an ELT data pipeline using Dagster, Docker, dbt, Polars, Snowflake, and PostgreSQL. Data comes from Kaggle.
- Bank-DataWarehouse Public
This project develops a data warehouse for a bank using Amazon Redshift, VPC, Glue, S3, and dbt, following a star schema architecture. The goal is to store, manage, and optimize data to suppo…
2 Updated Jun 8, 2024
- Zillow-Home-Value-Prediction Public
The Zillow Home Value Prediction project employs linear regression models on Kaggle datasets to forecast house prices. Using Apache Spark (PySpark) within a Docker setup enables efficient dat…
- InspireAI-Web-2024 Public
This project builds an AI chatbot web application with Django, using OpenAI's ChatGPT, DALL-E, and Codex.
- FDE-Course-2024-W4-DBT Public
Fundamental Data Engineering Course 2024, Week 4: learn dbt, transform data with models and macros, and build an ELT pipeline with Dagster.