A Comparative Analysis of Classification Algorithms on Diverse Datasets
Journal Title: Engineering, Technology & Applied Science Research - Year 2018, Vol 8, Issue 2
Abstract
Data mining is the computational process of finding patterns in large data sets. Classification, one of the main domains of data mining, generalizes known structure so that it can be applied to new data to predict its class. Many classification algorithms are in use, based on different methods such as probability, decision trees, neural networks, nearest neighbors, Boolean and fuzzy logic, and kernels. In this paper, we apply three diverse classification algorithms to ten datasets. The datasets were selected based on their size and/or the number and nature of their attributes. Results are discussed using performance evaluation measures such as precision, accuracy, F-measure, Kappa statistic, mean absolute error, relative absolute error, and ROC area. The comparative analysis is carried out using accuracy, precision, and F-measure. We specify the features and limitations of the classification algorithms for datasets of diverse nature.
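As an illustration of the evaluation measures named in the abstract, the following minimal sketch computes accuracy, precision, F-measure, and the Kappa statistic from the confusion-matrix counts of a binary classifier. The class name `BinaryCounts`, the function names, and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper, which does not state how its measures were computed.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """Confusion-matrix counts for a binary classifier (illustrative only)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def _ratio(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return num / den if den else 0.0


def accuracy(c: BinaryCounts) -> float:
    # Fraction of all instances that were classified correctly.
    return _ratio(c.tp + c.tn, c.total)


def precision(c: BinaryCounts) -> float:
    # Fraction of predicted positives that are truly positive.
    return _ratio(c.tp, c.tp + c.fp)


def recall(c: BinaryCounts) -> float:
    # Fraction of actual positives that were found.
    return _ratio(c.tp, c.tp + c.fn)


def f_measure(c: BinaryCounts) -> float:
    # Harmonic mean of precision and recall (the F1 score).
    p, r = precision(c), recall(c)
    return _ratio(2 * p * r, p + r)


def kappa(c: BinaryCounts) -> float:
    # Cohen's kappa: observed agreement corrected for chance agreement.
    n = c.total
    observed = accuracy(c)
    expected = _ratio(
        (c.tp + c.fp) * (c.tp + c.fn) + (c.fn + c.tn) * (c.fp + c.tn),
        n * n,
    )
    return _ratio(observed - expected, 1 - expected)


# Hypothetical counts, not results from the paper.
counts = BinaryCounts(tp=40, fp=10, fn=20, tn=30)
print(accuracy(counts), precision(counts), f_measure(counts), kappa(counts))
```

Accuracy alone can look favorable on imbalanced data, which is one reason to report precision, F-measure, and Kappa alongside it.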
Authors and Affiliations
M. Alghobiri