Data Clustering

Madhuri A. Dalal1 and Umesh L. Kulkarni2

1Saraswati College of Engineering, Kharghar, Navi Mumbai.

2Konkan Gyanpeeth College of Engineering, Karjat.


Data clustering is the process of grouping similar data items together. A clustering algorithm partitions a data set into several groups such that the similarity within a group is greater than the similarity between groups. The clustering problem has been addressed by researchers from many disciplines; however, efforts to perform effective and efficient clustering on large data sets began only in recent years, with the emergence of data mining. This paper presents an overview of clustering algorithms from a data mining perspective and reviews three of the most representative off-line clustering techniques. It also describes a number of engineering applications to illustrate the potential of clustering algorithms as a tool for handling complex real-world problems.

Keywords: K-Means clustering, Data mining, Data clustering, Clustering, Applications.

