Enabling Scalable Data Processing and Management through Standards-based Job Execution and the Global Federated File System

Journal Title: Scalable Computing: Practice and Experience - Year 2016, Vol 17, Issue 2

Abstract

An emerging challenge for scientific communities is to efficiently process big data obtained from experiments and computational simulations. Supercomputing architectures are available to provide scalable and high-performance processing environments, but many existing algorithm implementations are still unable to cope with their architectural complexity. One approach is to adopt innovative technologies that use these resources effectively and also handle large, geographically dispersed datasets. Such technologies should be accessible in a way that data scientists running data-intensive computations do not have to deal with the technical intricacies of the underlying execution system. Our work primarily focuses on providing data scientists with transparent access to these resources in order to easily analyze data. The impact of our work is demonstrated by describing how we enabled access to multiple high-performance computing resources through an open, standards-based middleware that takes advantage of the unified data management provided by the Global Federated File System. Our architectural design and its associated implementation are validated by a use case that requires massively parallel DBSCAN outlier detection on a 3D point cloud dataset.
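To make the use case concrete: DBSCAN labels points that have too few neighbours within a radius as noise, which is how it serves as an outlier detector on point clouds. The following is a minimal single-node sketch using scikit-learn's DBSCAN on an invented tiny 3D dataset; it only illustrates the idea and does not reproduce the massively parallel HPC implementation the paper describes (the parameters and data are assumptions made for the example).

```python
import numpy as np
from sklearn.cluster import DBSCAN

# A dense cluster of 3D points near the origin plus one distant outlier.
rng = np.random.default_rng(42)
cluster = rng.normal(loc=0.0, scale=0.1, size=(50, 3))
outlier = np.array([[10.0, 10.0, 10.0]])
points = np.vstack([cluster, outlier])

# Points with fewer than min_samples neighbours within distance eps
# are labelled -1, i.e. treated as noise/outliers.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(points)
outlier_mask = labels == -1
print("outliers detected:", int(outlier_mask.sum()))
```

In the paper's setting the same algorithm runs massively parallel over HPC resources; the sketch above only shows the outlier-labelling semantics of DBSCAN itself.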

Authors and Affiliations

Shahbaz Memon, Morris Riedel, Shiraz Memon, Chris Koeritz, Andrew Grimshaw, Helmut Neukirchen

  • EP ID EP203662
  • DOI 10.12694/scpe.v17i2.1160

How To Cite

Shahbaz Memon, Morris Riedel, Shiraz Memon, Chris Koeritz, Andrew Grimshaw, Helmut Neukirchen (2016). Enabling Scalable Data Processing and Management through Standards-based Job Execution and the Global Federated File System. Scalable Computing: Practice and Experience, 17(2), 115-128. https://europub.co.uk./articles/-A-203662