
This repository contains 9 projects I completed during my data engineering apprenticeship at Yandex.Practicum.


makarov-m/Yandex.Practicum.DE


Yandex.Practicum.DE

This repository contains 9 projects I completed during my apprenticeship at Yandex.Practicum. Link to the platform: https://practicum.yandex.ru/data-engineer/

Table of contents:

10. Final
Topic: Batch processing
Task: Create a pipeline that retrieves data from Postgres for a given time period and uploads the extract to Vertica
Tools: Postgres, Airflow, Vertica, Python, Metabase
Results: A stable pipeline has been developed
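
As a rough illustration of the "given time period" part of this task, the sketch below splits a period into half-open daily windows and pairs each with a parameterized extract query. It is not the project's actual code: the table name `transactions`, the column `transaction_dt`, and both helper functions are hypothetical, and the `%s` placeholders assume a psycopg2-style Postgres driver.

```python
from datetime import date, timedelta

# Hypothetical table and column names, used only for illustration.
EXTRACT_SQL = (
    "SELECT * FROM transactions "
    "WHERE transaction_dt >= %s AND transaction_dt < %s"
)


def daily_windows(start: date, end: date):
    """Yield half-open (window_start, window_end) daily intervals covering [start, end)."""
    current = start
    while current < end:
        nxt = current + timedelta(days=1)
        yield current, nxt
        current = nxt


def build_extract_jobs(start: date, end: date):
    """Pair the extract query with the parameters for each daily window."""
    return [(EXTRACT_SQL, window) for window in daily_windows(start, end)]
```

Half-open windows avoid loading a boundary row twice when the pipeline is rerun for adjacent periods.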

9. Microservices
Topic: Cloud services
Task: Receive real-time data from the Kafka broker, process it, and distribute it across the layers of the data warehouse
Tools: Kafka, Postgres, Redis, Kubernetes, Python
Results: Three microservices have been developed

8. Restaurant promotions
Topic: Real-time data processing
Task: Receive messages from Kafka, process them, and send the results to two receivers: a Postgres database and a new Kafka topic
Tools: PySpark, Kafka, Postgres
Results: Created a real-time data processing pipeline

7. Social network
Topic: Data Lake
Task: Create data marts on a regular basis in the Apache Hadoop file system
Tools: PySpark, Hadoop, Airflow
Results: Created 3 Spark jobs and scheduled them with an Airflow DAG

6. Analytical data warehouse
Topic: Data Vault
Task: Build an analytical storage on Vertica using the Data Vault storage model
Tools: Python, Docker, Vertica, Airflow
Results: Created an analytical data warehouse with 2 layers based on Vertica DB
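
One common Data Vault convention is to derive hub keys by hashing normalized business keys. The sketch below shows that idea only; it is an assumption about a typical approach, not the code used in this project. The function name `hash_key`, the `|` delimiter, and the choice of MD5 are all illustrative.

```python
import hashlib


def hash_key(*business_keys: str) -> str:
    """Build a deterministic hub hash key from one or more business keys.

    Keys are trimmed and upper-cased so that formatting differences in the
    source system do not produce different hashes. The '|' delimiter and MD5
    are illustrative choices, not taken from the project.
    """
    normalized = "|".join(key.strip().upper() for key in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()
```

Because the key depends only on the business keys, hubs, links, and satellites can be loaded independently and still join on the same value.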

5. No project

4. Settlements with couriers
Topic: ETL pipeline creation
Task: Load data from an external API into the local DB with Apache Airflow
Tools: Python, Docker, PostgreSQL, MongoDB, Airflow
Results: Scripts for data loading have been created

3. Online store
Topic: ETL pipeline creation
Task: Adapt the existing pipeline to modifications in the DB
Tools: Python, Docker, PostgreSQL, Airflow
Results: Scripts for data loading have been modified

2. Online store
Topic: DB structure optimization
Task: Rebuild the DB schema
Tools: Docker, PostgreSQL
Results: Scripts for changing the DB structure have been created

1. User Segmentation
Topic: Building data marts
Task: Create an RFM segmentation in the local DB
Tools: Docker, PostgreSQL
Results: Scripts for user segmentation have been created
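
For readers unfamiliar with RFM (recency, frequency, monetary value), the sketch below shows one way the scoring could work. It is written in plain Python for illustration, whereas the project itself used PostgreSQL scripts. The input layout `(user_id, order_date, amount)` and all function names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def rfm_metrics(orders, today):
    """Return {user: (recency_days, frequency, monetary)} from (user, date, amount) rows."""
    last, freq, money = {}, defaultdict(int), defaultdict(float)
    for user_id, order_date, amount in orders:
        last[user_id] = max(last.get(user_id, order_date), order_date)
        freq[user_id] += 1
        money[user_id] += amount
    return {u: ((today - last[u]).days, freq[u], money[u]) for u in last}


def score(values, n_groups=5, higher_is_better=True):
    """Rank users by a metric and map ranks to scores 1..n_groups (best = n_groups).

    Ties are broken by input order, so users with equal values may land in
    different groups.
    """
    ordered = sorted(values, key=lambda u: values[u], reverse=not higher_is_better)
    n = len(ordered)
    return {u: 1 + (i * n_groups) // n for i, u in enumerate(ordered)}


def rfm_segments(orders, today, n_groups=5):
    """Return {user: (r_score, f_score, m_score)}."""
    metrics = rfm_metrics(orders, today)
    # A smaller recency (a more recent order) is better, so the order is reversed.
    r = score({u: m[0] for u, m in metrics.items()}, n_groups, higher_is_better=False)
    f = score({u: m[1] for u, m in metrics.items()}, n_groups)
    m_ = score({u: m[2] for u, m in metrics.items()}, n_groups)
    return {u: (r[u], f[u], m_[u]) for u in metrics}
```

Rank-based groups roughly mirror what `NTILE` does in SQL: each score bucket gets about the same number of users regardless of how the metric is distributed.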
