Preserving Data Clustering with Expectation Maximization Algorithm
Journal Title: Journal of Information Systems and Telecommunication - Year 2016, Vol 4, Issue 3
Abstract
Data mining and knowledge discovery are important technologies for business and research. Despite their benefits in areas such as marketing, business and medical analysis, data mining techniques can also introduce new threats to privacy and information security. A class of methods called privacy preserving data mining (PPDM) has therefore been developed. The aim of research in this field is to develop techniques that can be applied to databases without violating the privacy of individuals. In this work we introduce a new approach, based on fuzzy logic, for protecting sensitive information in databases with both numerical and categorical attributes. We map a database into a new one that conceals private information while preserving the benefits of mining. In the proposed method, private data are transformed with fuzzy membership functions (MFs) such as Gaussian, P-shaped, Sigmoid, S-shaped and Z-shaped. The modified datasets are then clustered with the Expectation Maximization (EM) algorithm. Our experimental results show that using fuzzy logic to preserve data privacy yields valid clustering results while protecting sensitive information. The accuracy of the clustering algorithm on the fuzzified data is approximately equal to its accuracy on the original data and is better than that of state-of-the-art methods in this field.
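The pipeline described in the abstract (transform a private attribute with a membership function, then cluster with EM) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the MF parameters, the synthetic data and the simple one-dimensional two-component EM routine are all assumptions made for illustration, and the S-shaped and Z-shaped MFs follow a common spline definition that may differ from the one used in the paper.

```python
import math
import random


def gaussian_mf(x, mean, sigma):
    """Gaussian membership function (parameters are illustrative)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))


def sigmoid_mf(x, slope, center):
    """Sigmoid membership function."""
    return 1.0 / (1.0 + math.exp(-slope * (x - center)))


def s_shaped_mf(x, a, b):
    """S-shaped spline MF: 0 at or below a, 1 at or above b (assumed form)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    if x <= (a + b) / 2.0:
        return 2.0 * ((x - a) / (b - a)) ** 2
    return 1.0 - 2.0 * ((x - b) / (b - a)) ** 2


def z_shaped_mf(x, a, b):
    """Z-shaped MF, taken here as the mirror image of the S-shaped MF."""
    return 1.0 - s_shaped_mf(x, a, b)


def em_two_gaussians(values, iterations=50):
    """EM for a 1-D mixture of two Gaussians; returns hard cluster labels."""
    lo, hi = min(values), max(values)
    means = [lo, hi]  # component 0 starts at the minimum, 1 at the maximum
    start_var = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [start_var, start_var]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [
                weights[k]
                * math.exp(-((v - means[k]) ** 2) / (2.0 * variances[k]))
                / math.sqrt(2.0 * math.pi * variances[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor avoids a collapsed component
    return [0 if r[0] >= r[1] else 1 for r in resp]


# Synthetic "private" attribute with two groups (hypothetical data).
random.seed(0)
original = [random.gauss(30, 3) for _ in range(50)]
original += [random.gauss(70, 3) for _ in range(50)]

# Replace each raw value with its membership degree in [0, 1].
fuzzy = [s_shaped_mf(x, 0.0, 100.0) for x in original]

labels_original = em_two_gaussians(original)
labels_fuzzy = em_two_gaussians(fuzzy)
agreement = sum(a == b for a, b in zip(labels_original, labels_fuzzy)) / len(original)
print("fraction of records with the same cluster label:", agreement)
```

In this sketch the S-shaped MF is monotone on the chosen range, so the ordering of the records is kept while the raw values are replaced by membership degrees. A non-monotone MF such as the Gaussian maps values on either side of its center to the same degree, so its effect on cluster structure depends on how its parameters are chosen.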
Authors and Affiliations
Leila Jafar Tafreshi, Farzin Yaghmaee