PROCESSING IMAGE FILES USING SEQUENCE FILE IN HADOOP

Abstract

This paper presents MapReduce as a distributed data processing model that uses the open source Hadoop framework to process large volumes of data. The growing volume of data, especially multimedia data, creates new requirements for processing and storage. As an open source distributed computing framework, Hadoop provides the basic infrastructure for processing large numbers of images on an unbounded set of computing nodes. We have a large number of small image files and need to remove duplicate files from the available data. Most binary formats, particularly those that are compressed or encrypted, cannot be split and must be read as a single linear stream of data. Using such files as input to a MapReduce job means that a single mapper processes the entire file, which can cause a large performance hit. The paper proposes a splittable format, SequenceFile, and uses the MD5 algorithm to improve the performance of image processing.
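
The duplicate-removal idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation and does not use the Hadoop API: it is plain Python that imitates a map step (hash each image's bytes with MD5) and a reduce step (keep one file name per digest). The `(filename, bytes)` records stand in for the key/value entries a SequenceFile would hold, and the function names and sample data are hypothetical.

```python
import hashlib


def map_image(filename, data):
    """Map step: emit (MD5 hex digest of the image bytes, filename)."""
    return hashlib.md5(data).hexdigest(), filename


def reduce_duplicates(pairs):
    """Reduce step: keep the first filename seen for each digest."""
    unique = {}
    for digest, filename in pairs:
        # Files with identical bytes share a digest, so later copies are skipped.
        unique.setdefault(digest, filename)
    return unique


# Hypothetical records standing in for SequenceFile entries (name -> raw bytes).
records = [
    ("a.jpg", b"\xff\xd8image-one"),
    ("b.jpg", b"\xff\xd8image-two"),
    ("copy_of_a.jpg", b"\xff\xd8image-one"),
]

unique = reduce_duplicates(map_image(name, data) for name, data in records)
unique_names = sorted(unique.values())
print(unique_names)
```

In this sketch `copy_of_a.jpg` is dropped because its bytes hash to the same digest as `a.jpg`. In an actual MapReduce job the grouping by digest would be done by the shuffle phase rather than by a dictionary.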

Authors and Affiliations

Dr. E. Laxmi Lydia

  • EP ID EP159707
  • DOI 10.5281/zenodo.160893

How To Cite

Dr. E. Laxmi Lydia (30). PROCESSING IMAGE FILES USING SEQUENCE FILE IN HADOOP. International Journal of Engineering Sciences & Research Technology, 5(10), 521-528. https://europub.co.uk./articles/-A-159707