Efficiency of K-Means Clustering Algorithm in Mining Outliers from Large Data Sets

Journal Title: International Journal on Computer Science and Engineering - Year 2010, Vol 2, Issue 9

Abstract

This paper presents the performance of the k-means clustering algorithm as it depends on the method used to input the mean values. Clustering plays a vital role in data mining. Its main job is to group similar data together based on the characteristics they possess. The mean values are the centroids of the specified number of cluster groups. The centroids, though they change during the process of clustering, are calculated using several methods. Clustering algorithms can be applied to image analysis, pattern recognition, bioinformatics and several other fields. The clustering algorithm consists of two stages: the first stage forms the clusters and calculates their centroids, and the second stage determines the outliers. There are three methods for assigning the mean values in the k-means clustering algorithm. All three mean value assignment methods are implemented, their performance is analysed, and the methods are compared. Outliers, the disadvantage of the process, are used in the analysis to determine the performance with various mean inputs and methods.
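The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the three mean value assignment methods, so the sketch simply takes the initial centroids as an argument, and any assignment method can supply them. The function names, the fixed distance threshold used to flag outliers, and the sample data are all assumptions made for illustration.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Stage 1: form clusters and recalculate centroids until they stop moving."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its member points.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its previous centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def find_outliers(points, centroids, labels, threshold):
    """Stage 2 (assumed criterion): flag points farther than `threshold`
    from the centroid of the cluster they were assigned to."""
    return [
        p for p, lab in zip(points, labels)
        if math.dist(p, centroids[lab]) > threshold
    ]


# Made-up sample data: two tight groups and one distant point.
sample = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9), (5, 20)]
# Hypothetical initial mean values; a different input method would change only this line.
initial_means = [(0, 0), (10, 10)]

final_centroids, final_labels = kmeans(sample, initial_means)
outliers = find_outliers(sample, final_centroids, final_labels, threshold=5.0)
```

Because the initial mean values are an input rather than being fixed inside `kmeans`, the same data can be rerun with different starting centroids and the resulting outlier sets compared, which is the kind of comparison the abstract describes.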

Authors and Affiliations

Sridhar. A, Sowndarya. S

How To Cite

Sridhar. A, Sowndarya. S (2010). Efficiency of K-Means Clustering Algorithm in Mining Outliers from Large Data Sets. International Journal on Computer Science and Engineering, 2(9), 3043-3045. https://europub.co.uk./articles/-A-97295