Efficient Construction of Dictionary using Directed Acyclic Word Graph
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2016, Vol 18, Issue 4
Abstract
Abstract: Implementation of dictionary is a topic on which research is going on since a long time in the search of a better and efficient algorithm both in terms of space and time complexity. Its necessity has increased due to the advent of recent technical advances in the fields of Pattern Recognition, Image Processing, Natural Language Processing and Machine Learning. The matter is of utter importance when it comes to handheld devices where limited memory availability is a crucial factor. From a programmer’s point of view, the earlierand the conventional methods used to implement a dictionary were hashing and suffix trees which subsequently failed in order to provide the desired space efficiency. By far the data structure which has proved to be the best for storing a dictionary has been Directed Acyclic Word Graph (DAWG), which merges the common suffixes inaddition to the common prefixes of the words. The current method used for implementing it, requires construction of an intermediate Trie which puts a lot of memory constraints and is very time inefficient. In this paper, we describe a new method for creating a compressed dictionary with DAWG using Associative Arrays and concepts of Finite Automata on-the-fly as and when a new word is encountered in the lexicographic order. The method presented in the paper doesn’t require construction of an intermediate Trie and hence has a bettertime complexity. The concept of right language has been crucial to derive the algorithm and is mentioned in the paper. The algorithm is well explained with an example. In the end, the time complexity of the algorithm is analyzed and a small comparison is made in the terms of the maximum memory requirement, for different number of words, for both of the above mentioned methods of constructing DAWG at runtime
Authors and Affiliations
Srinivas. H , Amruth. V
Sentiment analysis for improving healthcare system for women
Abstract: The system proposes a feedback mechanism wherein, sentiment analysis is performed from surveys and tweets based on prevailing health issues among adult women in India and the social opinion on prevalenthealth i...
Design and Simulation of Dct Chip In Vhdl and Application in Watermark Extraction
Abstract: The paper presented the design, modeling and chip implementation of 2D Discrete Cosine Transform (DCT) domain for copyright protection of images, as digital watermarking chip. Recent improvement in comput...
Analyzing Process Behavior to Predict Resource Allocation in Distributed Environment by Using Time Series and Online Predictive Approach Algorithm
Abstract:A distributed system is a collection of different computers to handle large amount of data, the connected computers can share and coordinate their data on network. It is very difficult to for server to han...
Informational Retrieval Using Crawler & Protecting Social Networking Data from Information leakage
Abstract: Online social networks, such as Facebook, Twitter, Yahoo!, Google+ are utilized by many people. These networks allow users to publish details about themselves and to connect to their friends. Some of the...
Design Approach to Big data Systems in Developing and Maintaining the Information Security Systems
Abstract: Data is accumulating from almost all aspects of our everyday lives that it becomes huge and multistructuredand has hidden useful information. The challenges with Big Data include capture, curation, storage, sea...