Classifying Arabic Text Using KNN Classifier
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2016, Vol 7, Issue 6
Abstract
With the tremendous amount of electronic documents available, there is a great need to classify documents automatically. Classification is the task of assigning objects (images, text documents, etc.) to one of several predefined categories. Because the selection of important terms is vital to classifier performance, feature set reduction techniques such as stop word removal, stemming, and term thresholding are used in this paper. Three term-selection techniques are applied to a corpus of 1000 documents that fall into five categories. A comparison study is performed to find the effect of using the full-word, stem, and root term indexing methods. A k-nearest-neighbors (KNN) classifier is used in this study. The averages over all folds for recall, precision, fallout, and error rate were calculated. The results of the experiments carried out on the dataset show the importance of using k-fold testing, since it exposes the variation in the averages of recall, precision, fallout, and error rate for each category over the 10 folds.
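The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the term weighting or similarity measure, so raw term counts, whitespace tokenization, and cosine similarity are assumptions here, as are all function names. Arabic stop word removal and stemming are omitted; documents are assumed to be already reduced to index terms.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_predict(train, doc, k=3):
    """Label a document by majority vote among its k most similar training docs.

    train is a list of (term-count Counter, label) pairs.
    """
    vec = Counter(doc.split())
    ranked = sorted(train, key=lambda item: cosine(vec, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def category_metrics(true, pred, category):
    """Recall, precision, fallout, and error rate for one category."""
    pairs = list(zip(true, pred))
    tp = sum(t == category and p == category for t, p in pairs)
    fp = sum(t != category and p == category for t, p in pairs)
    fn = sum(t == category and p != category for t, p in pairs)
    tn = sum(t != category and p != category for t, p in pairs)

    def ratio(num, den):
        # Report 0.0 when a denominator is empty instead of raising.
        return num / den if den else 0.0

    return {
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "fallout": ratio(fp, fp + tn),
        "error_rate": ratio(fp + fn, len(pairs)),
    }


def k_fold_average(docs, labels, category, folds=10, k=3):
    """Average the per-category metrics over interleaved folds."""
    totals = Counter()
    for f in range(folds):
        test_idx = set(range(f, len(docs), folds))
        train = [(Counter(d.split()), l)
                 for i, (d, l) in enumerate(zip(docs, labels))
                 if i not in test_idx]
        true = [labels[i] for i in sorted(test_idx)]
        pred = [knn_predict(train, docs[i], k) for i in sorted(test_idx)]
        for name, value in category_metrics(true, pred, category).items():
            totals[name] += value
    return {name: value / folds for name, value in totals.items()}
```

Calling `k_fold_average(docs, labels, category)` once per category yields the per-category averages that the abstract refers to; inspecting the per-fold values before averaging shows how much they vary from fold to fold.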
Authors and Affiliations
Amer Al-Badarenah, Emad Al-Shawakfa, Khaleel Al-Rababah, Safwan Shatnawi, Basel Bani-Ismail
A new Hierarchical Group Key Management based on Clustering Scheme for Mobile Ad Hoc Networks
The migration from wired network to wireless network has been a global trend in the past few decades because they provide anytime-anywhere networking services. The wireless networks are rapidly deployed in the future, se...
The Role of Technical Analysis Indicators over Equity Market (NOMU) with R Programing Language
The stock market is a potent, fickle and fast-changing domain. Unanticipated market occurrences and unstructured financial information complicate predicting future market responses. A tool that continues to be advantageo...
An Efficient Method for Distributing Animated Slides of Web Presentations
Attention control of audience is required for successful presentations, therefore giving a presentation with immediate reaction, called reactive presentation, to unexpected changes in the context given by the audience...
Improved Langley and Ratio Langley Methods for Improving Sky-Radiometer Accuracy
Improved Langley Method (ILM) is proposed to improve the calibration accuracy of the sky-radiometer. The ILM uses that the calibration coefficients of other arbitrary wavelengths can be presumed from the calibration coef...
Data Flow Sequences: A Revision of Data Flow Diagrams for Modelling Applications using XML
Data Flow Diagrams were developed in the 1970’s as a method of modelling data flow when developing information systems. While DFDs are still being used, the modern web-based which is client-server based means that DFDs a...