Song Recommendation Based on Clustering: Project Overview

Created a tool that recommends songs from their audio features, helping people find Thai songs that match their preferences.
Optimized the K-means algorithm using the elbow method and the silhouette score, comparing the original data with data transformed by PCA.
Built a simple website using Flask and deployed it on Heroku.

Code and Resources

Python Version: 3.8

Packages: pandas, numpy, sklearn, matplotlib, seaborn, altair, flask, pickle

Data Source: http://organizeyourmusic.playlistmachinery.com/#

Tutorial on deploying an ML model on a web page: https://www.youtube.com/watch?v=i3RMlrx4ol4&t=487s

Tutorial on pushing a Flask app to Heroku: https://www.youtube.com/watch?v=Li0Abz-KT78&t=526s

Data Cleaning

Removed the unwanted columns.
Dropped duplicate songs.
Removed non-Thai songs.
Extracted the year and month in which each song was added.
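
The cleaning steps above can be sketched with pandas. This is only an illustration on a tiny made-up table: the column names (`title`, `artist`, `added`, `unused`) are hypothetical, and detecting Thai songs by looking for Thai characters in the title is an assumption, not necessarily how `spot_clean.py` does it.

```python
import pandas as pd

# Hypothetical sample; the real export's column names may differ.
raw = pd.DataFrame({
    "title": ["เพลงรัก", "เพลงรัก", "Hello", "ฝนตก"],
    "artist": ["A", "A", "B", "C"],
    "added": ["2020-01-15", "2020-01-15", "2019-07-01", "2021-03-09"],
    "unused": [1, 2, 3, 4],
})


def clean_songs(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the four cleaning steps to a copy of df."""
    out = df.drop(columns=["unused"])                      # remove unwanted columns
    out = out.drop_duplicates(subset=["title", "artist"])  # drop duplicate songs
    # Assumption: a song counts as Thai if its title contains a Thai character.
    is_thai = out["title"].str.contains(r"[\u0E00-\u0E7F]", regex=True)
    out = out[is_thai].copy()                              # remove non-Thai songs
    added = pd.to_datetime(out["added"])                   # parse the added date
    out["added_year"] = added.dt.year
    out["added_month"] = added.dt.month
    return out.reset_index(drop=True)


cleaned = clean_songs(raw)
print(cleaned)
```

If the dataset already carries a genre or language column, filtering on that would likely be more reliable than inspecting titles.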

EDA

The boxplots show that some features are on very different scales. Below are a few highlights from the EDA notebook.

Model Building

First, I standardized the data because the features are on different scales. Then I created a second dataset by applying PCA with 2 components.
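
A minimal sketch of this step with scikit-learn, run on synthetic data rather than the project's dataset. The four columns are stand-ins for audio features with different scales; only the use of standardization followed by a 2-component PCA comes from the description above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for audio features on very different scales.
X = np.column_stack([
    rng.normal(120, 30, 200),   # tempo-like values
    rng.normal(-7, 3, 200),     # loudness-like values
    rng.uniform(0, 100, 200),   # a 0-100 feature
    rng.uniform(0, 100, 200),   # another 0-100 feature
])

# Standardize each feature to zero mean and unit variance.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Project the standardized data onto its first two principal components.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

print(X_pca.shape)
print(pca.explained_variance_ratio_)
```

Fitting the scaler and the PCA once and saving them (as the repository's `scaler.pkl` and `pca.pkl` suggest) lets new inputs be transformed in the same way later.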

I evaluated the clustering with two methods: the elbow method and the silhouette score. I chose the PCA-transformed data because it gave a lower distortion score and a higher average silhouette score.

The two models compared:

K-means on the original data: number of clusters = 17, distortion score = 4017.89, average silhouette score = 0.12
K-means on the PCA data: number of clusters = 11, distortion score = 616.158, average silhouette score = 0.33
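
The comparison above can be reproduced in outline as follows. This is a sketch on synthetic blobs, not the project's data, and it assumes the distortion score corresponds to the within-cluster sum of squares that scikit-learn exposes as `inertia_`. The helper name `evaluate_k` is made up for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data standing in for the standardized or PCA-transformed features.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)


def evaluate_k(data, k_values):
    """Return {k: (distortion, average silhouette)} for each candidate k."""
    results = {}
    for k in k_values:
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(data)
        results[k] = (km.inertia_, silhouette_score(data, km.labels_))
    return results


scores = evaluate_k(X, range(2, 8))
for k, (distortion, silhouette) in scores.items():
    print(f"k={k}: distortion={distortion:.2f}, silhouette={silhouette:.3f}")

# Pick the k with the highest average silhouette score.
best_k = max(scores, key=lambda k: scores[k][1])
```

Running `evaluate_k` once on the original features and once on the PCA features gives the two sets of numbers to compare. Distortion values measured in spaces of different dimensionality are not on the same scale, so the silhouette score is the more directly comparable of the two.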

Productionization

In this step, I built a Flask web application hosted in the cloud on Heroku. The site takes a request containing audio feature values. You can either enter each value yourself or pick a level between "a little" and "a lot" for each audio feature, and then enter the genre you like most. The site returns two lists: the songs in the same cluster as your input, and the songs that are both in that cluster and in the genre you chose.
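
The lookup behind such a request might look like the sketch below. It is not the code in `Flaskapp`: the feature names, genres, and the `recommend` helper are hypothetical, and the scaler, PCA, and K-means objects are fitted on random data here instead of being loaded from `scaler.pkl`, `pca.pkl`, and `km.pkl`.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
features = ["energy", "danceability", "valence"]  # hypothetical feature names

# Random stand-in for the clustered song table.
songs = pd.DataFrame(rng.uniform(0, 100, (60, 3)), columns=features)
songs["title"] = [f"song_{i}" for i in range(60)]
songs["genre"] = rng.choice(["thai pop", "thai rock"], 60)

# In the real app these would be unpickled rather than refitted.
scaler = StandardScaler().fit(songs[features])
scaled = scaler.transform(songs[features])
pca = PCA(n_components=2).fit(scaled)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(pca.transform(scaled))
songs["cluster"] = km.labels_


def recommend(values, genre):
    """Map user feature values to a cluster and list the matching songs."""
    row = pd.DataFrame([values], columns=features)
    cluster = int(km.predict(pca.transform(scaler.transform(row)))[0])
    same_cluster = songs[songs["cluster"] == cluster]
    same_genre = same_cluster[same_cluster["genre"] == genre]
    return cluster, same_cluster["title"].tolist(), same_genre["title"].tolist()


cluster, in_cluster, in_genre = recommend([50, 50, 50], "thai pop")
print(cluster, len(in_cluster), len(in_genre))
```

A Flask view would read the feature values and genre from the submitted form, call a function like `recommend`, and render the two lists in a template.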

Web: https://chinsongrecommend.herokuapp.com/
