Document Image Binarization Using Independent Component Analysis For OCR

Abstract

Image binarization plays a vital role in text segmentation, which is used in OCR applications. Binarizing text in degraded images is a challenging task because of variations in the size, color and font of the text, and the results are often affected by complex backgrounds, varying lighting conditions, reflections and shadows. A robust solution to this problem can significantly enhance the precision of scene text recognition algorithms, leading to a variety of applications such as scene understanding, navigation, automatic localization and image retrieval. In this paper, we propose a novel method to extract and binarize text from images that contain complex backgrounds. We apply an Independent Component Analysis (ICA) based technique to map out the text region, which is uniform in nature, while removing specularity, shadows and reflections, which are included in the background. The algorithm performs well on images with different types of degradation. We apply our method to several DIBCO datasets.
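
The abstract does not specify how the ICA step is carried out, so the following is only a minimal sketch of one way such a pipeline could look, not the author's implementation. It assumes the three color channels are linear mixtures of independent sources, separates them with a small symmetric FastICA, picks the component with the highest kurtosis as the text layer, and binarizes it with Otsu's threshold. The function names (`ica_binarize`, `fast_ica`, `otsu_threshold`), the kurtosis-based selection, the Otsu step and the rule that text is the minority class are all assumptions made for illustration.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Return Otsu's threshold (a histogram bin edge) for a 1-D array."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(hist)                  # samples in bins 0..k
    w1 = w0[-1] - w0                      # samples in bins k+1..end
    s0 = np.cumsum(hist * centers)
    mu0 = s0 / np.maximum(w0, 1)
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)
    between = w0 * w1 * (mu0 - mu1) ** 2  # unnormalised between-class variance
    return edges[1:][np.argmax(between)]


def _sym_decorrelate(w):
    """Symmetric decorrelation: (W W^T)^(-1/2) W."""
    s, u = np.linalg.eigh(w @ w.T)
    return (u / np.sqrt(s)) @ u.T @ w


def fast_ica(z, max_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA with a tanh nonlinearity on whitened data z (k x n)."""
    k, n = z.shape
    rng = np.random.default_rng(seed)
    w = _sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(max_iter):
        y = np.tanh(w @ z)
        w_new = (y @ z.T) / n - np.diag((1.0 - y ** 2).mean(axis=1)) @ w
        w_new = _sym_decorrelate(w_new)
        delta = np.max(np.abs(np.abs(np.diag(w_new @ w.T)) - 1.0))
        w = w_new
        if delta < tol:
            break
    return w @ z


def ica_binarize(rgb, eps=1e-12):
    """Binarize an H x W x C image; True marks pixels assumed to be text."""
    h, w_, c = rgb.shape
    x = rgb.reshape(-1, c).T.astype(np.float64)  # channels x pixels (copy)
    x -= x.mean(axis=1, keepdims=True)
    # Whitening. eps guards against near-zero eigenvalues; channels that are
    # almost identical (near-grayscale input) will still separate poorly.
    d, e = np.linalg.eigh(x @ x.T / x.shape[1])
    z = (e / np.sqrt(np.maximum(d, eps))).T @ x
    s = fast_ica(z)
    # Assumption: text is sparse, so its component has the largest kurtosis.
    kurt = (s ** 4).mean(axis=1) - 3.0
    comp = s[np.argmax(kurt)]
    mask = comp > otsu_threshold(comp)
    # ICA leaves the sign undetermined; assume text covers fewer pixels.
    if mask.mean() > 0.5:
        mask = ~mask
    return mask.reshape(h, w_)
```

Calling `ica_binarize(image)` on a color image stored as a NumPy array returns a boolean mask of the same height and width. On real document images the linear-mixture assumption holds only approximately, so the component selection and threshold would likely need tuning.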

Authors and Affiliations

Varada Sreeja

How To Cite

Varada Sreeja (30).  Document Image Binarization Using Independent Component Analysis For OCR. International Journal of Engineering Sciences & Research Technology, 3(9), 161-166. https://europub.co.uk./articles/-A-158692