What is K-Medoids Clustering?
K-medoids clustering is a variation of the K-means clustering algorithm that uses representative data points, called medoids, instead of centroids to define cluster centers. While K-means calculates the mean of data points within a cluster, K-medoids selects one of the actual data points as the medoid, which is typically the most centrally located point within the cluster.
The steps involved in K-medoids clustering are similar to K-means:
Initialization: Randomly select K data points as the initial medoids.
Assignment step: For each data point, compute its dissimilarity (using a distance or dissimilarity metric) to each medoid, and assign the point to the cluster of the closest medoid.
Update step: For each cluster, evaluate the total dissimilarity of all data points to their medoid. Consider swapping the medoid with another data point within the same cluster and recalculate the total dissimilarity. Choose the data point that minimizes the total dissimilarity as the new medoid for that cluster.
Repeat steps 2 and 3: Iteratively perform the assignment and update steps until convergence. Convergence occurs when the medoids no longer change or when a maximum number of iterations is reached.
Final clustering: Once convergence is achieved, the algorithm outputs the final clustering, where each data point belongs to a specific cluster based on its distance to the nearest medoid.
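The steps above can be sketched as a minimal PAM-style implementation in Python. This is an illustrative sketch, not a production algorithm: the function name, parameters, and the choice of Euclidean distance are assumptions for the example.

```python
import numpy as np

def k_medoids(X, k, max_iter=100, rng=None):
    """Minimal PAM-style K-medoids sketch (Euclidean distance).

    Illustrative only: returns (medoid indices, cluster labels).
    """
    rng = np.random.default_rng(rng)
    n = len(X)
    # Precompute the full pairwise distance matrix (n x n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Initialization: randomly select k data points as medoids.
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid's cluster.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        # Update step: within each cluster, pick the member that
        # minimizes the total dissimilarity to the other members.
        for c in range(k):
            members = np.where(labels == c)[0]
            if len(members) == 0:
                continue
            costs = dist[np.ix_(members, members)].sum(axis=0)
            new_medoids[c] = members[np.argmin(costs)]
        # Convergence: stop when the medoid set no longer changes.
        if np.array_equal(np.sort(new_medoids), np.sort(medoids)):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels
```

Note that, unlike K-means, the returned medoids are indices of actual rows of `X`, so cluster centers are always real observations.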
Compared to K-means, K-medoids clustering has two main advantages. First, it is more robust to outliers and noise: because the cluster center must be an actual data point, a single extreme outlier cannot drag the center away the way it can shift a mean. Second, it only needs pairwise dissimilarities, so it can handle non-Euclidean distance measures or precomputed dissimilarity matrices, making it more versatile across applications.
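To see why non-Euclidean measures pose no problem, note that the update step only consults pairwise dissimilarities, never coordinate means. A small sketch using a precomputed Manhattan (cityblock) matrix (the data and cluster membership here are made up for illustration):

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Toy data; any dissimilarity works because K-medoids only reads
# the pairwise matrix, never averages coordinates.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])

# Precompute a Manhattan (cityblock) dissimilarity matrix.
D = squareform(pdist(X, metric="cityblock"))

# One medoid-update step for a cluster containing points {0, 1, 2}:
# the new medoid is the member with the smallest total dissimilarity
# to the other members, a quantity a K-means mean cannot express.
members = [0, 1, 2]
costs = D[np.ix_(members, members)].sum(axis=0)
best = members[int(np.argmin(costs))]
print(best)  # point 0 has the smallest total cityblock dissimilarity
```

The same code runs unchanged if `D` comes from string edit distances, graph distances, or any other dissimilarity matrix.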
However, K-medoids clustering can be computationally more expensive than K-means, since the update step must evaluate candidate medoid swaps in every iteration (the classic PAM swap phase costs O(k(n−k)²) per iteration). As a result, it is typically slower than K-means, especially on large datasets.
K-medoids clustering has similar applications to K-means clustering, including customer segmentation, image analysis, document clustering, and pattern recognition. It is particularly useful when the choice of representative points (medoids) is critical or when dealing with datasets with noise and outliers.