A method for calculating the number of clusters for the k-means algorithm

Author(s):  V.V. Frolov, Dr., associate Professor, V.N. Karazin Kharkiv National University, Kharkiv, Ukraine, vvicfrol@rambler.ru

S.E. Slipchenko, NTU «KhPI», Kharkiv, Ukraine, serg.slip@gmail.com

O.Yu. Prikhodko, candidate of Sciences, associate Professor, BSTU named after V.G. Shukhov, Belgorod, Russia, prihodko.o.u@gmail.com

Issue:  Volume 47 № 1

Rubric:  Infocommunication technologies

Annotation:  This article proposes a method for estimating the optimal number of clusters for the k-means algorithm. The method calculates the optimal number of clusters for partitioning a source set by analyzing several evaluation criteria. The main criterion is the dynamics of the redistribution of objects among clusters on transition from one partition to another; these dynamics are assessed by calculating the norm of the transition matrix. The additional criterion is an estimate of the change in the potential energy of objects inside the clusters of a single partition. The auxiliary criterion determines the number of clusters from characteristic points on the plots of the main and additional criteria. The essence of the method is a set of rules for applying the main, additional, and auxiliary criteria. The sequence in which the rules are executed is implemented as a Matlab function. Comparative analysis shows that the integrated assessment method increases the accuracy of determining the optimal number of clusters by 40 %.
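As a rough illustration of the main criterion, the sketch below builds a transition matrix between two partitions of the same objects and takes its norm. The abstract does not define the matrix or the norm, so both choices here are assumptions: the matrix counts objects that belong to cluster i in the first partition and cluster j in the second, and the norm is the Frobenius norm. The function names and the toy labels are hypothetical, and Python is used only for illustration (the authors' implementation is a Matlab function).

```python
import math


def transition_matrix(labels_a, labels_b):
    """Count objects in cluster i of partition A and cluster j of partition B.

    Assumption: this counting definition is illustrative, not taken from the article.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    rows = max(labels_a) + 1
    cols = max(labels_b) + 1
    matrix = [[0] * cols for _ in range(rows)]
    for a, b in zip(labels_a, labels_b):
        matrix[a][b] += 1
    return matrix


def frobenius_norm(matrix):
    """Square root of the sum of squared entries (an assumed choice of norm)."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


# Hypothetical toy labels: six objects partitioned with k = 2, then with k = 3.
labels_k2 = [0, 0, 0, 1, 1, 1]
labels_k3 = [0, 0, 2, 1, 1, 2]

T = transition_matrix(labels_k2, labels_k3)
norm_value = frobenius_norm(T)
```

In practice the label lists would come from running k-means for consecutive values of k, and the norm would be tracked as k grows; how the article combines it with the additional and auxiliary criteria is not specified in the abstract.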

Keywords:  cluster analysis, cluster, clustering stability, partition of a set, partition quality criterion, k-means, cluster center, centroid.
