Efficient Preprocessing and Patterns Identification Approach for Text Mining

Journal Title: INTERNATIONAL JOURNAL OF COMPUTER TRENDS & TECHNOLOGY - Year 2013, Vol 6, Issue 2

Abstract

Due to the rapid expansion of digital data, knowledge discovery and data mining have attracted a significant amount of attention for turning such data into useful information and knowledge. Text categorization continues to be one of the most researched NLP problems on account of the ever-increasing volume of electronic documents and digital libraries. We present a novel text categorization method that combines decisions on multiple attributes. Since most existing text mining methods adopt term-based approaches, they all suffer from the problems of polysemy and synonymy. Existing pattern discovery techniques include the processes of pattern deploying and pattern evolving, which improve the effectiveness of using and updating discovered patterns for finding relevant and interesting information. However, current association rule methods fall short in two respects when used for pattern classification. One is that the method ignores information about a word's frequency in a text. The other is that the method needs rule pruning whenever a large number of rules is generated. In the proposed work, documents are preprocessed before pattern discovery. The document dataset is preprocessed using tokenization, stemming, and probability filtering. The proposed approach gives better decision rules compared to the existing approach.
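The preprocessing stages named in the abstract (tokenization, stemming, probability filtering) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define "probability filtering", so the sketch assumes it means keeping only terms whose relative frequency in the corpus lies between two thresholds. The suffix list, the function names, and the `min_prob`/`max_prob` parameters are all hypothetical, and the suffix stripper is a crude stand-in for a real stemmer such as Porter's.

```python
import re
from collections import Counter

# Hypothetical suffix list; a crude stand-in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(documents, min_prob=0.01, max_prob=0.5):
    """Tokenize and stem each document, then filter terms by probability.

    Assumption: "probability" here is a term's share of all term
    occurrences in the corpus; terms outside [min_prob, max_prob] are dropped.
    """
    stemmed_docs = [[stem(t) for t in tokenize(doc)] for doc in documents]
    counts = Counter(t for doc in stemmed_docs for t in doc)
    total = sum(counts.values())
    keep = {t for t, c in counts.items() if min_prob <= c / total <= max_prob}
    return [[t for t in doc if t in keep] for doc in stemmed_docs]
```

Under these assumptions, calling `preprocess(docs)` on a list of raw strings returns one list of filtered stems per document, which could then be passed to a pattern discovery step.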

Authors and Affiliations

Pattan Kalesha, M. Babu Rao, Ch. Kavitha

How To Cite

Pattan Kalesha, M. Babu Rao, Ch. Kavitha (2013). Efficient Preprocessing and Patterns Identification Approach for Text Mining. INTERNATIONAL JOURNAL OF COMPUTER TRENDS & TECHNOLOGY, 6(2), 124-129. https://europub.co.uk./articles/-A-146823