Word Extraction Using X-Y Cut Algorithm

Abstract

The digitization of printed documents is a major motivation for current work on the text of scanned documents. Converting scanned handwritten or printed documents into an electronically readable form makes it possible to store, exchange, and process the valuable information they contain. Text recognition aims to recognize the text of a printed or handwritten document and convert it to a desired format. It involves several steps: preprocessing, segmentation, feature extraction, classification, and post-processing. Preprocessing converts the grayscale image into a binary image and removes noise from it. Segmentation splits the document image into lines and extracts each word from each segmented line. Feature extraction computes the characteristics of each character. Classification covers the database of these characteristics and their further processing. This paper proposes a language-independent approach to extracting words, based on a set of properties computed for each connected component in the binary image of the whole document.
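The abstract gives no implementation details, so the following is only a minimal sketch of the preprocessing and segmentation steps it describes: binarize the page, cut it into text lines along empty rows, then cut each line into words along sufficiently wide empty column gaps. This is a simple two-level projection-profile form of the X-Y cut, not the connected-component method the paper itself proposes. The function names `runs` and `extract_words`, the fixed `threshold`, and the `word_gap` parameter are all assumptions made for illustration.

```python
import numpy as np


def runs(mask, min_gap=1):
    """Return (start, end) pairs for runs of True in a 1-D boolean array.

    Runs separated by fewer than `min_gap` False entries are merged.
    `end` is exclusive, so mask[start:end] is the run.
    """
    spans = []
    start = None
    for i, on in enumerate(mask):
        if on and start is None:
            start = i                      # a run begins
        elif not on and start is not None:
            spans.append([start, i])       # a run ends
            start = None
    if start is not None:
        spans.append([start, len(mask)])   # run reaches the edge

    merged = []
    for s, e in spans:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1][1] = e              # gap too small: same unit
        else:
            merged.append([s, e])
    return [(s, e) for s, e in merged]


def extract_words(gray, threshold=128, word_gap=3):
    """Return word boxes (top, bottom, left, right) from a grayscale page.

    Assumes dark text on a light background; `threshold` and `word_gap`
    are illustrative values, not taken from the paper.
    """
    ink = np.asarray(gray) < threshold     # preprocessing: binarize
    boxes = []
    # Horizontal cut: rows containing ink form text lines.
    for top, bottom in runs(ink.any(axis=1)):
        line = ink[top:bottom]
        # Vertical cut: columns containing ink form words; gaps narrower
        # than `word_gap` are treated as spacing between characters.
        for left, right in runs(line.any(axis=0), min_gap=word_gap):
            boxes.append((top, bottom, left, right))
    return boxes
```

Calling `extract_words(page)` on a 2-D array of grayscale values returns one box per detected word, in reading order. A fixed global threshold and a fixed gap width are the simplest possible choices; real scans would likely need an adaptive threshold and a gap width derived from the page's character spacing.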

Authors and Affiliations

Simple Batra

Download PDF file
  • EP ID EP423561
  • DOI 10.9790/9622-0812010104.

How To Cite

Simple Batra (2018). Word Extraction Using X-Y Cut Algorithm. International Journal of Engineering Research and Applications, 8(12), 1-4. https://europub.co.uk./articles/-A-423561