Final Project of CSC4160 at CUHK(SZ): Cloud-Based Distributed MapReduce System
This project implements a cloud-based distributed MapReduce system inspired by Google’s original MapReduce framework. It demonstrates scalability, fault tolerance, and integration with cloud services for large-scale data processing.
The demonstration is provided via this link; we recommend turning on English subtitles when watching.
- Go (1.20 or later) installed on your system.
- Access to a cloud environment such as AWS EC2 for multi-node execution (optional).
- Source code cloned from the GitHub repository.
- Navigate to the src/main directory:
  cd src/main
- Build the WordCount plugin:
  go build -buildmode=plugin ../mrapps/wc.go
- Clean up previous outputs (if any):
  rm mr-out*
- Run the WordCount task in sequential mode:
  go run mrsequential.go wc.so pg*.txt
- View the output:
  more mr-out-0