Hierarchical Classifiers for Multi-Way Sentiment Analysis of Arabic Reviews

Abstract

Sentiment Analysis (SA) is one of the hottest fields in data mining (DM) and natural language processing (NLP). The goal of SA is to extract the sentiment conveyed in a given text based on its content. While most current work focuses on the simple problem of determining whether the sentiment is positive or negative, Multi-Way Sentiment Analysis (MWSA) focuses on sentiments conveyed through a rating or scoring system (e.g., a 5-star scoring system). In such scoring systems, the sentiments conveyed in two reviews with close scores (such as 4 stars and 5 stars) can be very similar, which creates an added challenge compared to traditional SA. One intuitive way of handling this challenge is a divide-and-conquer approach, in which the MWSA problem is divided into a set of sub-problems so that customized classifiers can be used to differentiate between reviews with close scores. A hierarchical classification structure fits this approach: each node represents a different classification sub-problem, and its decision may lead to the invocation of another classifier. In this work, we show how this divide-and-conquer hierarchical structure of classifiers can generate better results than existing flat classifiers for the MWSA problem. We focus on the Arabic language for several reasons, including the importance of the language and the scarcity of prior work and available tools for it. To the best of our knowledge, very few papers have been published on MWSA of Arabic reviews. One notable work is that of Ali and Atiya, in which the authors collected the Large-scale Arabic Book Reviews (LABR) dataset and made it publicly available. Unfortunately, the baseline experiments on this dataset had very low accuracy. We present two different hierarchical structures and compare their accuracies with that of the flat structure using different core classifiers.
The comparison is based on standard accuracy measures such as precision and recall, together with the mean squared error (MSE), which is a more informative measure here because not all misclassifications are equally severe: predicting 4 stars for a 5-star review is a smaller error than predicting 1 star. The results show that, in general, hierarchical classifiers give significant improvements (of more than 50% in certain cases) over flat classifiers.
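
The two ideas in the abstract, a hierarchy in which one classifier's decision invokes the next, and MSE as a measure that separates near misses from far misses, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not specify the two hierarchical structures or the core classifiers, so the tree layout (polarity first, then strength), the `Node` class, and the English keyword rules standing in for trained classifiers are all hypothetical.

```python
from typing import Callable, Dict, Sequence, Union

# A classifier maps a review to the name of a branch (hypothetical interface).
Classifier = Callable[[str], str]


class Node:
    """One classification sub-problem in an illustrative hierarchy."""

    def __init__(self, classify: Classifier,
                 children: Dict[str, Union["Node", int]]):
        self.classify = classify    # returns a branch name for a review
        self.children = children    # branch name -> sub-classifier or final rating

    def predict(self, review: str) -> int:
        child = self.children[self.classify(review)]
        # An int is a final star rating; otherwise invoke the next classifier.
        return child if isinstance(child, int) else child.predict(review)


# Toy keyword rules standing in for trained core classifiers (assumption).
def polarity(review: str) -> str:
    words = review.lower().split()
    if "bad" in words or "terrible" in words:
        return "negative"
    if "good" in words or "excellent" in words:
        return "positive"
    return "neutral"


def negative_strength(review: str) -> str:
    return "strong" if "terrible" in review.lower().split() else "mild"


def positive_strength(review: str) -> str:
    return "strong" if "excellent" in review.lower().split() else "mild"


# Hypothetical 5-star hierarchy: the root separates polarity, and the
# children separate the close scores (1 vs 2, and 4 vs 5).
tree = Node(polarity, {
    "negative": Node(negative_strength, {"strong": 1, "mild": 2}),
    "neutral": 3,
    "positive": Node(positive_strength, {"strong": 5, "mild": 4}),
})


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Two prediction sets with the same accuracy but very different MSE.
y_true = [5, 5, 1, 1]
near_miss = [4, 5, 2, 1]   # each error is off by one star
far_miss = [1, 5, 5, 1]    # each error is off by four stars

if __name__ == "__main__":
    print(tree.predict("an excellent book"))          # 5
    print(tree.predict("good plot"))                  # 4
    print(accuracy(y_true, near_miss),
          mean_squared_error(y_true, near_miss))      # 0.5 0.5
    print(accuracy(y_true, far_miss),
          mean_squared_error(y_true, far_miss))       # 0.5 8.0
```

In this sketch both prediction sets get half of the reviews right, so accuracy cannot tell them apart, while MSE penalizes the four-star errors sixteen times more heavily than the one-star errors. In a real setting each `classify` function would be replaced by a classifier trained only on the reviews relevant to its node.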

Authors and Affiliations

Mahmoud Al-Ayyoub, Aya Nuseir, Ghassan Kanaan, Riyad Al-Shalabi

  • EP ID EP101370
  • DOI 10.14569/IJACSA.2016.070269

How To Cite

Mahmoud Al-Ayyoub, Aya Nuseir, Ghassan Kanaan, Riyad Al-Shalabi (2016). Hierarchical Classifiers for Multi-Way Sentiment Analysis of Arabic Reviews. International Journal of Advanced Computer Science & Applications, 7(2), 531-539. https://europub.co.uk./articles/-A-101370