GitHub - raleighgee/kmodes at d3be8ed3840ef6d4b57948c2f803fe8a817845bb

kmodes

Description

Python implementations of the k-modes and k-prototypes clustering algorithms. Relies on numpy for a lot of the heavy lifting.

k-modes is used for clustering categorical variables. It defines clusters based on the number of matching categories between data points. (This is in contrast to the more well-known k-means algorithm, which clusters numerical data based on Euclidean distance.) The k-prototypes algorithm combines k-modes and k-means and is able to cluster mixed numerical / categorical data.
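The matching-based dissimilarity described above can be illustrated with a small NumPy sketch. This is not the package's own code: the function names `matching_dissim` and `column_modes` are made up for this example, and the toy records are invented. It only shows the idea of counting mismatched categories and of summarizing a cluster by its per-attribute mode.

```python
import numpy as np


def matching_dissim(a, b):
    """Count the attributes on which two categorical records differ."""
    return int(np.sum(np.asarray(a) != np.asarray(b)))


def column_modes(points):
    """Return the most frequent category in each column (the cluster's mode)."""
    points = np.asarray(points)
    modes = []
    for col in points.T:
        values, counts = np.unique(col, return_counts=True)
        modes.append(values[np.argmax(counts)])
    return np.array(modes)


# Two toy records that differ in exactly one attribute (hypothetical data).
x = np.array(["red", "small", "round"])
y = np.array(["red", "large", "round"])
d = matching_dissim(x, y)

# A toy cluster; its mode takes the majority category of each column.
cluster = np.array([
    ["red", "small", "round"],
    ["red", "large", "round"],
    ["blue", "small", "round"],
])
mode = column_modes(cluster)
```

In k-means the cluster center is the mean of its points; here the analogous role is played by the mode, which is why the algorithm works on data with no meaningful numeric distance.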

Implemented are:

k-modes [1][2]
k-modes with initialization based on density [3]
k-prototypes [1]
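For the k-prototypes entry in the list above, the mixed dissimilarity from [1] can be sketched as squared Euclidean distance on the numerical attributes plus a weighted mismatch count on the categorical ones. The function name `mixed_dissim`, the argument layout, and the default weight `gamma=1.0` are assumptions made for this illustration, not the package's interface.

```python
import numpy as np


def mixed_dissim(x_num, x_cat, c_num, c_cat, gamma=1.0):
    """Dissimilarity between a record and a prototype with mixed attributes.

    Numerical part: squared Euclidean distance.
    Categorical part: number of mismatched categories, weighted by gamma.
    """
    diff = np.asarray(x_num, dtype=float) - np.asarray(c_num, dtype=float)
    numeric_part = float(np.sum(diff ** 2))
    categorical_part = int(np.sum(np.asarray(x_cat) != np.asarray(c_cat)))
    return numeric_part + gamma * categorical_part


# Hypothetical record and prototype: one unit apart numerically,
# one mismatch among the two categorical attributes.
dist = mixed_dissim([1.0, 2.0], ["a", "b"], [0.0, 2.0], ["a", "c"], gamma=0.5)
```

The weight `gamma` balances the two parts: a small value lets the numerical attributes dominate, a large value favors the categorical ones.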

The code is modeled after the k-means module in scikit-learn and has the same familiar interface.

Usage examples of both k-modes ('soybean.py') and k-prototypes ('stocks.py') are included.

I would love to have more people play around with this and give me feedback on my implementation.

Enjoy!

Usage

```python
import numpy as np
from kmodes import kmodes

# random categorical data
data = np.random.choice(20, (100, 10))

km = kmodes.KModes(n_clusters=4, init='Huang', n_init=5, verbose=1)
km.fit_predict(data)
```

References

[1] Huang, Z.: Clustering large data sets with mixed numeric and categorical values, Proceedings of the First Pacific Asia Knowledge Discovery and Data Mining Conference, Singapore, pp. 21-34, 1997.

[2] Huang, Z.: Extensions to the k-modes algorithm for clustering large data sets with categorical values, Data Mining and Knowledge Discovery 2(3), pp. 283-304, 1998.

[3] Cao, F., Liang, J., Bai, L.: A new initialization method for categorical data clustering, Expert Systems with Applications 36(7), pp. 10223-10228, 2009.
