Classifying Arabic Text Using KNN Classifier

Abstract

With the tremendous amount of electronic documents available, there is a great need to classify documents automatically. Classification is the task of assigning objects (images, text documents, etc.) to one of several predefined categories. Because the selection of important terms is vital to classifier performance, feature-set reduction techniques such as stop-word removal, stemming, and term thresholding were used in this paper. Three term-selection techniques are applied to a corpus of 1000 documents that fall into five categories. A comparison study is performed to find the effect of using the full-word, stem, and root term-indexing methods. A k-nearest-neighbors (KNN) classifier is used in this study. The averages over all folds for recall, precision, fallout, and error rate were calculated. The results of the experiments carried out on the dataset show the importance of using k-fold testing, since it exposes the variation in the averages of recall, precision, fallout, and error rate for each category across the 10 folds.
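The four measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' formulas, so the code below assumes the standard one-vs-rest definitions: recall = TP / (TP + FN), precision = TP / (TP + FP), fallout = FP / (FP + TN), and error rate = (FP + FN) / N. The function names and the category labels in the usage note are hypothetical and are not taken from the paper.

```python
def category_metrics(true_labels, predicted_labels, category):
    """One-vs-rest recall, precision, fallout, and error rate for one category.

    Assumes the standard definitions; the paper's exact formulas
    are not stated in the abstract.
    """
    tp = fp = fn = tn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if guess == category and truth == category:
            tp += 1  # correctly assigned to the category
        elif guess == category:
            fp += 1  # wrongly assigned to the category
        elif truth == category:
            fn += 1  # belongs to the category but was missed
        else:
            tn += 1  # correctly left out of the category
    total = tp + fp + fn + tn
    return {
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "fallout": fp / (fp + tn) if fp + tn else 0.0,
        "error_rate": (fp + fn) / total if total else 0.0,
    }


def average_over_folds(fold_results, category):
    """Average each measure over folds.

    fold_results is a list of (true_labels, predicted_labels) pairs,
    one pair per test fold (10 pairs for 10-fold testing).
    """
    per_fold = [category_metrics(t, p, category) for t, p in fold_results]
    return {
        name: sum(m[name] for m in per_fold) / len(per_fold)
        for name in per_fold[0]
    }


if __name__ == "__main__":
    # Two tiny hypothetical folds with made-up category labels.
    folds = [
        (["sports", "sports", "politics", "economy"],
         ["sports", "politics", "politics", "economy"]),
        (["sports", "politics"],
         ["sports", "politics"]),
    ]
    print(average_over_folds(folds, "sports"))
```

Reporting the per-fold values alongside the average, rather than the average alone, is what makes the fold-to-fold variation mentioned in the abstract visible.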

Authors and Affiliations

Amer Al-Badarenah, Emad Al-Shawakfa, Khaleel Al-Rababah, Safwan Shatnawi, Basel Bani-Ismail

  • EP ID EP154286
  • DOI 10.14569/IJACSA.2016.070633

How To Cite

Amer Al-Badarenah, Emad Al-Shawakfa, Khaleel Al-Rababah, Safwan Shatnawi, Basel Bani-Ismail (2016). Classifying Arabic Text Using KNN Classifier. International Journal of Advanced Computer Science & Applications, 7(6), 259-268. https://europub.co.uk./articles/-A-154286