Credit to Konur Unyelioglu, whose code and explanation of consensus clustering helped me apply this project in real life. See his original article and the git repo for consensus clustering.
This project uses the Apache Spark MLlib clustering library to explore data sets via unsupervised machine learning techniques. It includes scripts to determine the optimal number of clusters, to compare the performance of three clustering algorithms via consensus clustering, and finally to examine which features matter most in determining cluster membership, using a trick based on the Random Forest feature importance vector.
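To make the consensus clustering idea concrete, here is a minimal sketch in plain Python (not Spark, and not the scripts in this repo). It assumes each clustering run was made on a subsample of the data and is given as a mapping from item index to cluster label. The function name `consensus_matrix` and the toy `runs` input are hypothetical and exist only for illustration. For each pair of items, the consensus value is the fraction of runs containing both items in which they landed in the same cluster.

```python
from itertools import combinations


def consensus_matrix(runs, n_items):
    """Build a consensus (co-association) matrix.

    runs: list of dicts, one per subsampled clustering run,
          mapping item index -> cluster label (hypothetical input format).
    n_items: total number of items in the full data set.
    """
    together = [[0] * n_items for _ in range(n_items)]  # co-clustered counts
    sampled = [[0] * n_items for _ in range(n_items)]   # co-sampled counts

    for labels in runs:
        for i, j in combinations(sorted(labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if labels[i] == labels[j]:
                together[i][j] += 1
                together[j][i] += 1

    matrix = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        matrix[i][i] = 1.0  # an item always agrees with itself
        for j in range(n_items):
            if i != j and sampled[i][j]:
                matrix[i][j] = together[i][j] / sampled[i][j]
    return matrix


# Toy example: four items, four subsampled runs (labels are made up).
runs = [
    {0: "a", 1: "a", 2: "b", 3: "b"},
    {0: "x", 1: "x", 2: "y"},
    {1: "p", 2: "q", 3: "q"},
    {0: "m", 1: "n", 3: "n"},
]
m = consensus_matrix(runs, 4)
for row in m:
    print([round(v, 2) for v in row])
```

Values near 0 or 1 mean the runs agree about a pair of items, while values in between mean the assignment is unstable. Comparing how sharply the matrix separates into 0s and 1s is one way to compare algorithms or candidate cluster counts.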