Text Separation from Graphics by Analyzing Stroke Width Variety in Persian City Maps

Abstract

Text segmentation is an active research field with many areas still to be explored. Separating the text layer from graphics is a fundamental step in exploiting the text and graphics information in a map. The language used in the map is a challenging factor in the text layer separation problem, and all current methods have been proposed for non-Persian maps. In Persian, text strings are composed of one or more subwords, and each subword is in turn composed of one or more connected letters. The components of Persian text strings are therefore more diverse in size and geometric form than those of English. As a result, the overlap of Persian text with lines usually produces a complex structure that existing methods cannot handle efficiently enough. To address this, we calculate the stroke width variety of the input map and then estimate the average line width of the graphics by analyzing the stroke width content. After finding the average width of the graphical lines, we classify the complex structure into text and graphics at the pixel level. We evaluate our method on a variety of fully crossing text and graphics in Persian maps and obtain promising results, with precision above 80% and recall above 90%.
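
The pipeline outlined in the abstract (compute stroke widths, estimate the graphics line width, classify pixels) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a binary image given as a list of rows of 0/1 values, approximates the stroke width at a pixel by the shorter of the horizontal and vertical foreground runs through it, takes the most frequent width as the line width estimate (the abstract speaks of an average line width obtained by analyzing the stroke width content, without giving details), and uses a hypothetical tolerance parameter `tol`. All function names are illustrative.

```python
from collections import Counter


def stroke_widths(img):
    """Approximate per-pixel stroke width as the shorter of the horizontal
    and vertical foreground run passing through the pixel (0 for background)."""
    h, w = len(img), len(img[0])
    horiz = [[0] * w for _ in range(h)]
    vert = [[0] * w for _ in range(h)]
    # Horizontal run lengths, row by row.
    for y in range(h):
        x = 0
        while x < w:
            if img[y][x]:
                start = x
                while x < w and img[y][x]:
                    x += 1
                for k in range(start, x):
                    horiz[y][k] = x - start
            else:
                x += 1
    # Vertical run lengths, column by column.
    for x in range(w):
        y = 0
        while y < h:
            if img[y][x]:
                start = y
                while y < h and img[y][x]:
                    y += 1
                for k in range(start, y):
                    vert[k][x] = y - start
            else:
                y += 1
    return [[min(horiz[y][x], vert[y][x]) for x in range(w)] for y in range(h)]


def estimate_line_width(widths):
    """Assumption: graphical lines dominate the foreground, so the most
    frequent non-zero stroke width is taken as the line width."""
    counts = Counter(v for row in widths for v in row if v > 0)
    return counts.most_common(1)[0][0]


def classify(widths, line_width, tol=0):
    """Label each pixel: 0 = background, 1 = graphics, 2 = text.
    A pixel is graphics if its width is within `tol` of the line width."""
    return [
        [0 if v == 0 else (1 if abs(v - line_width) <= tol else 2) for v in row]
        for row in widths
    ]
```

In this sketch, a foreground pixel whose stroke width differs from the estimated line width is treated as text. A real map would need a more robust width measure (for example along the local stroke normal) and a tolerance tuned to the scan resolution.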

Authors and Affiliations

Ali Ghafari-Beranghar, Ehsanollah Kabir, Kaveh Kangarloo

  • EP ID EP322108
  • DOI 10.14569/IJACSA.2018.090632

How To Cite

Ali Ghafari-Beranghar, Ehsanollah Kabir, Kaveh Kangarloo (2018). Text Separation from Graphics by Analyzing Stroke Width Variety in Persian City Maps. International Journal of Advanced Computer Science & Applications, 9(6), 222-229. https://europub.co.uk./articles/-A-322108