OLAWSDS: An Online Arabic Web Spam Detection System
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2014, Vol 5, Issue 2
Abstract
For marketing purposes, some website designers and administrators use illegal Search Engine Optimization (SEO) techniques to raise the ranking of their Web pages and mislead search engines. Some Arabic Web pages manipulate both content and link features to artificially increase their rank in the Search Engine Results Pages (SERPs). This study extends previous work in this field. It covers the design and implementation of an online Arabic Web spam detection system, grounded in algorithms and mathematical foundations, that detects Arabic content and link Web spam using a tree of spam detection conditions together with user feedback collected through a custom Web browser. Users take part in the decision about any Web page through their feedback, judging whether the Arabic Web pages shown in the browser are relevant to their particular queries. The proposed system uses the content and link features extracted from Arabic Web pages to label each page as spam or non-spam. It also attempts to learn from user feedback to improve its performance automatically. The proposed system is evaluated through statistical analysis using the Statistical Package for the Social Sciences (SPSS) software, with user feedback treated as the dependent variable and the Arabic content and link features treated as independent variables. SPSS is used to apply a variety of tests, including analysis of variance (ANOVA). ANOVA is used to show the relationships between the dependent and independent variables in the dataset, which supports problem solving and more informed decisions and results.
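The ANOVA step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' SPSS procedure: it assumes a one-way design in which pages are grouped by a binned content feature (a hypothetical keyword-density level) and the dependent variable is a numeric user feedback rating. The function name `one_way_anova_f` and all sample ratings are invented for illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups.

    Each group holds the dependent-variable values (here: hypothetical
    user feedback ratings) observed at one level of the independent factor.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    # Raises ZeroDivisionError if every group has zero internal variance.
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented ratings (1 = relevant ... 7 = clearly spam), grouped by an
# assumed keyword-density level: low, medium, high.
feedback_by_density = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(feedback_by_density)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the critical value for the given degrees of freedom would suggest that mean feedback differs across feature levels; a statistics package such as SPSS additionally reports the corresponding p-value.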
Authors and Affiliations
Mohammed Al-Kabi, Heider Wahsheh, Izzat Alsmadi