Cross-Lingual Sentiment Classification from English to Arabic using Machine Translation

Abstract

Cross-lingual sentiment learning is becoming increasingly important because user-generated content on social media is multilingual and resources for languages other than English are scarce. It is also a challenging task, both because translated data and original data follow different distributions and because of the language gap, i.e. each language has its own ways of expressing sentiment. This work explores the adaptation of English sentiment analysis resources to a new language, Arabic. The aim is to design a lightweight model for cross-lingual sentiment classification from English to Arabic that requires no manual annotation, is easy to build, and does not depend on deep linguistic analysis. The ultimate goal is to find an optimal baseline model and to determine the relation between the noise in the translated data and the accuracy of sentiment classification. To find this baseline, different configurations of several factors are investigated, including feature representation, feature reduction methods, and learning algorithms. Experiments show that a good classification model can be obtained from translated data despite the artificial noise introduced by machine translation. The results also show that automation carries a significant cost, and thus the best path to future enhancement is through the inclusion of language-specific knowledge and resources.

Authors and Affiliations

Adel Al-Shabi, Aisah Adel, Nazlia Omar, Tareq Al-Moslmi

  • EP ID EP259593
  • DOI 10.14569/IJACSA.2017.081257

How To Cite

Adel Al-Shabi, Aisah Adel, Nazlia Omar, Tareq Al-Moslmi (2017). Cross-Lingual Sentiment Classification from English to Arabic using Machine Translation. International Journal of Advanced Computer Science & Applications, 8(12), 434-440. https://europub.co.uk./articles/-A-259593