This project was based on the video 'Stock Market Real-Time Data Analysis Using Kafka | End-To-End Data Engineering Project' by Darshil Parmar.
The goal is to simulate a real-time data processing pipeline with Apache Kafka, running on a local cluster of Docker containers.
- Docker;
- AWS account;
- Inside the Kafka container, create the topic:
kafka-topics --bootstrap-server IP_CLUSTER:PORT --create --topic TOPIC_NAME --partitions N_PARTITIONS --replication-factor N_REPLICATION_FACTOR
- Tutorial: Step 1: Create your first S3 bucket
- stock-producer.ipynb file;
- stock-consumer.ipynb file;
- Tutorial: Running SQL queries using Amazon Athena
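
If the topic-creation step is scripted rather than typed by hand, the argument order of `kafka-topics` is easy to get wrong, since `--topic` must be followed directly by the topic name. The sketch below is only an illustration: `build_create_topic_command` is a hypothetical helper, and `localhost:9092` and `stock-data` are placeholder values, not settings taken from this project. It builds the argument list, which could then be passed to `subprocess.run` or prefixed with a `docker exec` call.

```python
import shlex


def build_create_topic_command(bootstrap_server, topic, partitions=1, replication_factor=1):
    """Return the kafka-topics argument list for creating a topic.

    Hypothetical helper for illustration; it only assembles the command
    and does not contact a Kafka cluster.
    """
    if partitions < 1 or replication_factor < 1:
        raise ValueError("partitions and replication_factor must be >= 1")
    return [
        "kafka-topics",
        "--bootstrap-server", bootstrap_server,
        "--create",
        "--topic", topic,  # the topic name must directly follow --topic
        "--partitions", str(partitions),
        "--replication-factor", str(replication_factor),
    ]


if __name__ == "__main__":
    # Placeholder values; substitute the cluster address and topic name in use.
    cmd = build_create_topic_command("localhost:9092", "stock-data", 1, 1)
    print(shlex.join(cmd))
```

Returning a list instead of a single string avoids shell-quoting problems when the command is run through `subprocess.run`.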