A Hybrid Approach for Complex Layout Detection of Newspapers in Gurumukhi Script Using Deep Learning
Journal Title: International Journal of Experimental Research and Review - Year 2023, Vol 35, Issue 6
Abstract
Layout analysis is a crucial stage in any newspaper recognition system: the better the layout analysis, the better the recognition results. The complexity of newspaper layouts poses a formidable challenge for digitization, because the intricate arrangement of text, images, and sections demands sophisticated algorithms for accurate layout detection. This paper reviews a diverse set of methodologies from the existing literature, highlighting the evolution of techniques for newspaper layout analysis, and presents a novel hybrid method for detecting the complex layout of newspapers in the Gurumukhi script. The method consists of two parts. In the first part, we propose an algorithm that removes pictures and graphics from Punjabi newspaper images using image preprocessing steps based on binarization, contour finding, and erosion. The algorithm removes pictures even from complex non-Manhattan layouts; tested on 100 newspaper images, it achieved an accuracy of 96.22%. In the second part, we created a dataset of 500 newspaper images labeled with five classes and trained a deep-learning model based on a convolutional neural network (CNN) to detect the columns of text in newspapers. We used four different CNN architectures and compared their performance using metrics such as precision, recall, and F1 score. We tested this method on a number of newspapers in the Gurumukhi script and achieved an accuracy of 95.53% with this approach.
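The abstract names three steps for the picture-removal stage (binarization, contour finding, erosion) without giving details. The following is a minimal pure-Python sketch of one possible reading of that pipeline, not the authors' algorithm. The function names, the global threshold, the 3x3 erosion window, the use of connected regions as a stand-in for contours, and the `min_pixels` and `margin` parameters are all assumptions made for illustration. The idea is that thin text strokes vanish under erosion, while solid picture areas leave a surviving core that can be grown back and blanked.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Map a grayscale grid (0-255) to 1 for ink (dark) and 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def erode(binary):
    """3x3 erosion: a pixel stays ink only if its whole 3x3 neighbourhood is ink.

    Border pixels are always cleared (the outside is treated as background).
    """
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(binary[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out


def components(binary):
    """Yield one list of (y, x) pixels per 4-connected ink region.

    Assumption: connected regions stand in for the contours named in the abstract.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if binary[y][x] and not seen[y][x]:
                seen[y][x] = True
                queue, region = deque([(y, x)]), []
                while queue:
                    cy, cx = queue.popleft()
                    region.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                yield region


def remove_pictures(gray, threshold=128, min_pixels=9, margin=1, background=255):
    """Return a copy of `gray` with large solid regions painted as background.

    Hypothetical parameters: regions whose eroded core has fewer than
    `min_pixels` pixels are kept; `margin` grows the core back by the
    amount the 3x3 erosion shrank it.
    """
    h, w = len(gray), len(gray[0])
    cleaned = [row[:] for row in gray]
    core = erode(binarize(gray, threshold))
    for region in components(core):
        if len(region) < min_pixels:
            continue
        # Blank each core pixel plus its margin, so the mask follows the
        # region's own shape instead of a rectangular bounding box.
        for cy, cx in region:
            for y in range(max(0, cy - margin), min(h, cy + margin + 1)):
                for x in range(max(0, cx - margin), min(w, cx + margin + 1)):
                    cleaned[y][x] = background
    return cleaned
```

Calling `remove_pictures(page)` on a grid of grayscale values returns a new grid and leaves the input untouched. Because the mask follows each region's shape rather than its bounding box, the same sketch applies to non-rectangular pictures, which is the case the abstract emphasizes for non-Manhattan layouts.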
Authors and Affiliations
Atul Kumar, Gurpreet Singh Lehal