Abstract:
We consider the problem of clustering observations $x_i \in \mathbb{R}^d$, $i = 1, \ldots, n$, into $k$
possible clusters. We are mainly interested in clustering in the presence of outliers,
where classical clustering algorithms face challenges.
Within the framework of center-based clustering, which uses a seeding method to initialize the centroids and then updates them in each iteration, we propose the
Modified k-Means method. It introduces a new sampling method for initializing the centroids, a straightforward and easily understood modification of the Robust k-Means++ method
[1], together with a new centroid update strategy that limits the effect of outliers during the centroid update stage.
Using this Modified k-Means algorithm as a building block, we propose a Robust
center-based clustering algorithm that performs outlier detection and data clustering simultaneously. The proposed algorithm consists of two stages: the first stage
runs the Modified k-Means process, while the second stage iteratively removes the
points that are far from their cluster centers. The experimental results suggest
that our method outperforms Robust k-Means++ [1], TMK++
[2], and local search (LSO) [3] on real-world and synthetic data.
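
The two-stage structure described above can be illustrated with a minimal sketch. It is not the proposed algorithm: plain Lloyd iterations from caller-supplied initial centers stand in for the Modified k-Means stage, whose sampling and update rules are not specified in this abstract. The function names (`two_stage_sketch`, `lloyd`, `sq_dists`), the outlier budget `n_outliers`, and the batch size `per_round` are all hypothetical, and the sketch assumes `n_outliers` is smaller than the number of points.

```python
import numpy as np

def sq_dists(X, centers):
    # Squared Euclidean distance from every point to every center, shape (n, k).
    return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)

def lloyd(X, centers, n_iter=100):
    # Plain Lloyd iterations: an assumed stand-in for the Modified k-Means stage.
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        labels = sq_dists(X, centers).argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers

def two_stage_sketch(X, init_centers, n_outliers, per_round=1):
    # Hypothetical interface: n_outliers is an assumed removal budget.
    X = np.asarray(X, dtype=float)
    keep = np.ones(len(X), dtype=bool)

    # Stage 1: center-based clustering on all points.
    centers = lloyd(X, init_centers)

    # Stage 2: repeatedly drop the points farthest from their nearest
    # center, then refit the centers on the points that remain.
    removed = 0
    while removed < n_outliers:
        d2 = sq_dists(X, centers).min(axis=1)
        d2[~keep] = -np.inf  # never pick an already removed point again
        m = min(per_round, n_outliers - removed)
        farthest = np.argsort(d2)[-m:]
        keep[farthest] = False
        removed += m
        centers = lloyd(X[keep], centers)

    # Final assignment; removed points are marked with the label -1.
    labels = sq_dists(X, centers).argmin(axis=1)
    labels[~keep] = -1
    return centers, labels
```

Calling `two_stage_sketch(X, init_centers, n_outliers=m)` returns the refitted centers and a label vector in which the `m` removed points carry the label `-1`. Refitting after each removal round matters because a distant point pulls its centroid toward itself, which can distort the distance ranking of the remaining points.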