Semi-Structured Data to Structured Data Conversion Using Data Mining Methods
Journal Title: International Journal of Emerging Trends in Science and Technology - Year 2017, Vol 4, Issue 10
Abstract
Emerging technologies for semi-structured data have attracted wide attention in areas such as networks, e-commerce, information retrieval, and databases. In these applications, the data are modeled not as static collections but as transient data streams, where the data source is an unbounded stream of individual data items. It is becoming increasingly common to send heterogeneous and ill-structured data through networks. Since traditional database technologies are not directly applicable to such data streams, it is important to study efficient information extraction methods for semi-structured data. Hence there is increasing demand for automatic methods for extracting useful information, in particular for discovering rules or patterns from large collections of semi-structured data, namely semi-structured data mining. We introduce a class of simple combinatorial patterns over texts, such as proximity phrase association patterns and ordered and unordered tree patterns, that model unstructured texts and semi-structured data on the Web. In addition, we consider the problem of finding the patterns that optimize a given statistical measure within the whole class of patterns in a large collection of unstructured texts. For these classes of patterns, we develop fast and robust text mining algorithms based on techniques from computational geometry, string matching, and combinatorial optimization. We implemented the developed text and semi-structured data mining algorithms and ran experiments on interactive document browsing in a large text database and on keyword and common structure discovery from the Web.
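As an illustration only, the notion of a proximity phrase association pattern can be sketched in a few lines of Python. The abstract does not define the pattern formally, so this sketch assumes one common reading: a pattern is an ordered list of phrases plus a proximity bound k, and it matches a tokenized text when the phrases occur in order with at most k tokens between the end of one phrase and the start of the next. The function names (find_phrase, matches, accuracy) and the choice of classification accuracy as the statistical measure are hypothetical, and the brute-force search below is not the fast algorithm the paper refers to.

```python
from typing import List, Sequence


def find_phrase(tokens: List[str], phrase: Sequence[str]) -> List[int]:
    """Return every index at which `phrase` occurs as a contiguous run in `tokens`."""
    m = len(phrase)
    return [i for i in range(len(tokens) - m + 1) if tokens[i:i + m] == list(phrase)]


def matches(tokens: List[str], phrases: Sequence[Sequence[str]], k: int) -> bool:
    """Assumed semantics: phrases occur in the given order, and each phrase
    starts at most k tokens after the end of the previous one."""
    if not phrases:
        return True
    # Positions just past an occurrence of the phrases matched so far.
    ends = {i + len(phrases[0]) for i in find_phrase(tokens, phrases[0])}
    for phrase in phrases[1:]:
        next_ends = set()
        for i in find_phrase(tokens, phrase):
            if any(0 <= i - e <= k for e in ends):
                next_ends.add(i + len(phrase))
        ends = next_ends
        if not ends:
            return False
    return bool(ends)


def accuracy(docs: List[List[str]], labels: List[bool],
             phrases: Sequence[Sequence[str]], k: int) -> float:
    """One possible statistical measure: the fraction of documents for which
    'pattern matches' agrees with the document's label."""
    hits = sum(matches(d, phrases, k) == y for d, y in zip(docs, labels))
    return hits / len(docs)


if __name__ == "__main__":
    text = "semi structured data mining finds patterns in web data".split()
    pattern = [["semi", "structured"], ["patterns"]]
    print(matches(text, pattern, k=3))
    print(matches(text, pattern, k=2))
```

Under these assumptions, finding an optimal pattern means searching the pattern class for the phrases and k that maximize a measure such as accuracy over a labeled collection; this sketch only evaluates a single candidate pattern.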
Authors and Affiliations
B. Suchitra