Clustering Algorithm for mixed datatypes - K-Prototypes

Name: Clustering Algorithm for mixed datatypes - K-Prototypes
Uploaded: Apr 18, 2020
Duration: 707 s

AIEngineering · 77.6K subscribers

48.9K views

#datascience #machinelearning #ml K-means-based methods are efficient for processing large data sets, but they are often limited to numeric data. K-means optimizes a cost function defined on the Euclidean distance between data points and cluster means. Because minimizing that cost function relies on calculating means, its use is limited to numeric data. This is where K-Prototypes shines. When applied to numeric data, the algorithm is identical to k-means. For categorical data, the algorithm uses a simple matching dissimilarity measure, replaces cluster means with modes, and uses a frequency-based method to update the modes during clustering so as to minimize the clustering cost function.
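
The description above can be illustrated with a minimal pure-Python sketch. It is not the code shown in the video. It assumes the combined cost is the squared Euclidean distance on numeric attributes plus a weight `gamma` times the number of categorical mismatches; `gamma`, the function names, and the batch (Lloyd-style) reassignment loop are all assumptions made for illustration.

```python
from collections import Counter

def mixed_dissimilarity(x_num, x_cat, proto_num, proto_cat, gamma=1.0):
    """Squared Euclidean distance on the numeric attributes plus gamma times
    the count of mismatching categorical attributes (simple matching).
    gamma is an assumed weighting parameter, not taken from the description."""
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    categorical = sum(a != b for a, b in zip(x_cat, proto_cat))
    return numeric + gamma * categorical

def update_prototype(members_num, members_cat):
    """Prototype = mean of each numeric column, mode (most frequent value)
    of each categorical column."""
    n = len(members_num)
    proto_num = [sum(col) / n for col in zip(*members_num)]
    proto_cat = [Counter(col).most_common(1)[0][0] for col in zip(*members_cat)]
    return proto_num, proto_cat

def k_prototypes(nums, cats, init_idx, gamma=1.0, max_iter=20):
    """Batch variant: assign every point to its nearest prototype, then
    recompute prototypes, until the assignments stop changing."""
    protos = [(list(nums[i]), list(cats[i])) for i in init_idx]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(protos)),
                key=lambda k: mixed_dissimilarity(x, c, protos[k][0], protos[k][1], gamma))
            for x, c in zip(nums, cats)
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(protos)):
            idx = [i for i, lab in enumerate(labels) if lab == k]
            if idx:  # keep the old prototype if a cluster becomes empty
                protos[k] = update_prototype([nums[i] for i in idx],
                                             [cats[i] for i in idx])
    return labels, protos

# Hypothetical toy data: two numeric and two categorical attributes per row.
nums = [[1.0, 2.0], [1.2, 1.8], [8.0, 9.0], [8.5, 9.5]]
cats = [["red", "s"], ["red", "s"], ["blue", "l"], ["blue", "m"]]
labels, protos = k_prototypes(nums, cats, init_idx=[0, 2])
print(labels)
print(protos)
```

With only numeric columns the categorical term vanishes and the loop reduces to ordinary k-means, which matches the claim in the description. For real data sets, a maintained implementation such as the `kmodes` package is likely a better starting point than this sketch.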
