Comparative analysis of mid-point based and proposed mean based K-Means Clustering Algorithm for Data Mining

Abstract

In the original k-means algorithm, the initial centroids are chosen at random from the input data set. This random selection can lead the algorithm into local optima, and the final clustering results may differ from run to run. This limitation must be addressed to make the k-means algorithm more efficient. An existing approach uses the mid-point as the metric for computing the initial centroids. Although that algorithm suits a wide variety of problems, it does not suit all of them: because it computes the mid-point of different subsets of the data set, it works best when the input data is regularly or uniformly distributed across the space. When the input data is irregular or non-uniformly distributed, it does not produce appropriate results. This paper proposes the mean as the metric for choosing the initial centroids and compares the two algorithms.
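
The difference between the two metrics can be illustrated with a minimal Python sketch. The abstract does not say how the subsets of the data set are formed, so the partitioning used here (sorting points by distance from the origin and cutting the sorted list into k nearly equal parts) is an assumption, and the function names are hypothetical rather than taken from the paper.

```python
from statistics import fmean


def split_into_subsets(points, k):
    """Assumed partitioning: sort by squared distance from the origin,
    then cut the sorted list into k nearly equal consecutive parts."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    ordered = sorted(points, key=lambda p: sum(x * x for x in p))
    n = len(ordered)
    bounds = [i * n // k for i in range(k + 1)]
    return [ordered[bounds[i]:bounds[i + 1]] for i in range(k)]


def midpoint_centroid(subset):
    """Per-dimension mid-point of a subset: (min + max) / 2."""
    return tuple((min(col) + max(col)) / 2 for col in zip(*subset))


def mean_centroid(subset):
    """Per-dimension arithmetic mean of a subset."""
    return tuple(fmean(col) for col in zip(*subset))


def initial_centroids(points, k, metric):
    """Apply the chosen metric to each subset to get k initial centroids."""
    return [metric(subset) for subset in split_into_subsets(points, k)]


# Non-uniformly distributed toy data: each half contains one far-away point.
points = [(1, 1), (2, 2), (3, 3), (10, 10),
          (20, 20), (21, 21), (22, 22), (40, 40)]

print(initial_centroids(points, 2, midpoint_centroid))
# [(5.5, 5.5), (30.0, 30.0)]
print(initial_centroids(points, 2, mean_centroid))
# [(4.0, 4.0), (25.75, 25.75)]
```

On this toy data the mid-point depends only on the extreme values of each subset, so a single distant point pulls it away from the dense region, whereas the mean stays closer to where most points lie. This is consistent with the abstract's claim about non-uniformly distributed data, but it is only an illustration under the stated assumptions and not a reproduction of the paper's results.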

Authors and Affiliations

Kirti Aggarwal, Neha Aggarwal


How To Cite

Kirti Aggarwal, Neha Aggarwal (2012). Comparative analysis of mid-point based and proposed mean based K-Means Clustering Algorithm for Data Mining. International Journal of Computational Engineering and Management IJCEM, 15(4), 71-74. https://europub.co.uk./articles/-A-97983