Proposing a Keyword Extraction Scheme based on Standard Deviation, Frequency and Conceptual Relation of the Words
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 4
Abstract
Every text contains a few keywords that carry important information about its content. Since this limited set of words is supposed to describe the overall concept of a text (e.g. an article or a book), choosing the keywords correctly plays an important role in representing that text properly. Despite several efforts in this field, none of the methods published so far is accurate enough to elicit representative words for retrieving a wide variety of texts. In this study, an unsupervised scheme is proposed that is independent of the domain, language, structure and length of a text. The proposed method uses word frequency in conjunction with the standard deviation of the locations at which each word occurs in the text, and also considers the conceptual relations between words. In the next stage, the selected keywords are given a secondary score by the statistical criterion TF-ISF, in order to improve on the baseline TF-IDF method. Moreover, the proposed hybrid method does not remove stopwords, since they might be part of bigram keywords, whereas similar approaches remove all stopwords in their first stage. Experimental results on the well-known SEMEVAL dataset indicate that the proposed method outperforms state-of-the-art schemes in terms of F-score and accuracy. The introduced hybrid method can therefore be considered an alternative scheme for accurate keyword extraction.
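The abstract names the statistical ingredients but does not give their formulas, so the following is only a minimal sketch of two of them under stated assumptions: a per-word frequency paired with the standard deviation of its token positions, and a TF-ISF score taken here in its common form (term frequency times the log of inverse sentence frequency). The function names, the regex tokenizer, and the normalisation of the spread by text length are all hypothetical choices for illustration, not the authors' implementation. The conceptual-relation step and the way the scores are combined are not specified in the abstract and are left out.

```python
import math
import re
from collections import defaultdict
from statistics import pstdev


def tokenize(text):
    # Hypothetical tokenizer: lowercase alphanumeric runs, stopwords kept.
    return re.findall(r"[a-z0-9]+", text.lower())


def position_stats(tokens):
    """Map each word to (frequency, spread of its positions).

    Spread is the population standard deviation of the word's token
    indices divided by the text length (an assumed normalisation).
    """
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)
    n = len(tokens)
    return {
        word: (len(pos), pstdev(pos) / n if n else 0.0)
        for word, pos in positions.items()
    }


def tf_isf(sentences):
    """Assumed TF-ISF form: tf(w) * log(N_sentences / sentences containing w)."""
    tokenized = [tokenize(s) for s in sentences]
    n_sent = len(tokenized)
    term_freq = defaultdict(int)
    sent_freq = defaultdict(int)
    for tokens in tokenized:
        for token in tokens:
            term_freq[token] += 1
        for token in set(tokens):
            sent_freq[token] += 1
    return {w: term_freq[w] * math.log(n_sent / sent_freq[w]) for w in term_freq}


sentences = ["alpha beta", "alpha gamma", "alpha delta"]
tokens = tokenize(" ".join(sentences))
stats = position_stats(tokens)
scores = tf_isf(sentences)
```

In this toy input, `alpha` occurs in every sentence, so its TF-ISF score under the assumed formula is zero, while its positions are spread across the whole text. How such signals should be weighted against each other is a design decision the abstract does not disclose.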
Authors and Affiliations
Shadi Masaeli, Seyed Mostafa Fakhrahmad, Reza Boostani, Betsabeh Tanoori