This repository provides a high-performance implementation of the K-Means clustering algorithm, optimized for execution on NVIDIA GPUs using CUDA. The project focuses on leveraging shared memory and parallel reduction techniques to achieve significant performance improvements over traditional CPU-based approaches.
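As a rough illustration of what the GPU kernels parallelize, the following pure-Python sketch runs one K-Means iteration on the CPU. It is not the repository's code: the function names `tree_reduce_sum` and `kmeans_step` are hypothetical, and the pairwise summation only mimics the access pattern of a shared-memory block reduction, where each inner-loop index would correspond to one thread.

```python
def tree_reduce_sum(values):
    """Sum values pairwise, halving the active range each pass (tree reduction)."""
    buf = list(values)
    n = len(buf)
    while n > 1:
        half = (n + 1) // 2
        # On a GPU, each index i would be handled by a separate thread,
        # with a barrier between passes.
        for i in range(n - half):
            buf[i] += buf[i + half]
        n = half
    return buf[0] if buf else 0.0


def kmeans_step(points, centroids):
    """Run one assignment + update iteration; return (labels, new_centroids)."""
    dim = len(centroids[0])

    # Assignment step: each point picks the nearest centroid (squared Euclidean).
    labels = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(dists.index(min(dists)))

    # Update step: each centroid becomes the mean of its assigned points.
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            # Design choice of this sketch: an empty cluster keeps its centroid.
            new_centroids.append(list(old))
            continue
        new_centroids.append(
            [tree_reduce_sum([m[d] for m in members]) / len(members) for d in range(dim)]
        )
    return labels, new_centroids
```

Calling `kmeans_step` repeatedly until the labels stop changing gives the basic algorithm; the assignment loop is independent per point and the per-cluster sums are reductions, which is why both map naturally onto GPU threads.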
The project requires the following components:
- NVIDIA GPU with Compute Capability 6.0 or higher
- CUDA Toolkit (minimum version 11.0)
- C++ compiler supporting CUDA (such as nvcc from the CUDA Toolkit)
- Clone the repository:
git clone lorenzo-27/kmeans-cuda
cd kmeans-cuda
- Configure the algorithm parameters:
  - Open kmeans_config.py
  - Adjust the clustering parameters according to your requirements
- Compile the project:
  - If using CLion with CUDA support, the build process is handled automatically.
  - For manual compilation, create a cmake-build-release directory for the build output, or update the executable path in kmeans.py.
- Run the program:
  - Use the Python script kmeans.py to execute the compiled binary and manage datasets and results.
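The manual build and the executable lookup described in the steps above could be sketched as follows. This is an illustration under stated assumptions, not the contents of kmeans.py: it assumes a CMake project with `CMakeLists.txt` at the repository root, and the executable name `kmeans_cuda` and all function names are hypothetical placeholders.

```python
import subprocess
from pathlib import Path

BUILD_DIR = "cmake-build-release"  # directory name taken from the README
EXECUTABLE_NAME = "kmeans_cuda"    # ASSUMPTION: the real target name may differ


def cmake_commands(build_dir=BUILD_DIR):
    """Return the CMake configure and build commands for a Release build."""
    return [
        ["cmake", "-S", ".", "-B", build_dir, "-DCMAKE_BUILD_TYPE=Release"],
        ["cmake", "--build", build_dir],
    ]


def find_executable(project_root, build_dir=BUILD_DIR, name=EXECUTABLE_NAME):
    """Locate the compiled binary, or raise with a hint about the path."""
    candidate = Path(project_root) / build_dir / name
    if not candidate.is_file():
        raise FileNotFoundError(
            f"{candidate} not found; build the project or update the executable path"
        )
    return candidate


def run_binary(executable, args=()):
    """Run the binary and return its standard output as text."""
    result = subprocess.run(
        [str(executable), *map(str, args)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Each command returned by `cmake_commands()` can be passed to `subprocess.run(cmd, check=True)` from the repository root. On Windows the binary would carry an `.exe` suffix, so `EXECUTABLE_NAME` would need adjusting.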
Note: Upon execution, the program automatically creates two directories:
- data/: Contains generated datasets
- results/: Stores performance plots and analysis tables
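The directory creation described in the note can be approximated with a few lines of Python. This is a minimal sketch rather than the project's actual code: the helper name `ensure_output_dirs` is hypothetical, and it is an assumption that both directories live directly under the project root.

```python
from pathlib import Path

OUTPUT_DIRS = ("data", "results")  # directory names taken from the README


def ensure_output_dirs(project_root="."):
    """Create data/ and results/ under project_root if missing; return their paths."""
    paths = [Path(project_root) / name for name in OUTPUT_DIRS]
    for path in paths:
        # exist_ok=True makes repeated runs safe: existing directories are left untouched.
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

Because the call is idempotent, it can run at the start of every execution without overwriting previously generated datasets or results.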
For a comprehensive understanding of the implementation and performance analysis, please refer to our detailed technical report available here. The report includes:
- Implementation details
- Performance benchmarks
- Experimental results and analysis
This project is licensed under the MIT License.