GRID COMPUTING AND FAULT TOLERANCE APPROACH
Journal Title: International Journal of Computer Science and Management Studies (IJCSMS) www.ijcsms.com - Year 2011, Vol 11, Issue 3
Abstract
Grid computing is a means of applying the computational power of a large number of computers to a complex, difficult computation or problem. It is a distributed computing paradigm that differs from traditional distributed computing in that it targets large-scale systems that may even span organizational boundaries. This paper proposes a method to achieve maximum fault tolerance in the grid environment by taking reliability into consideration and combining a replication approach with a checkpoint approach. Fault tolerance is an important property for large-scale computational grid systems, where geographically distributed nodes cooperate to execute a task. To achieve a high level of reliability and availability, the grid infrastructure should be robustly fault tolerant. Since resource failures can be fatal to job execution, a fault tolerance service is essential to satisfy QoS requirements in grid computing. Commonly used techniques for providing fault tolerance are job checkpointing and replication. Both techniques reduce the amount of work lost due to changing system availability, but they can introduce significant runtime overhead. This overhead depends largely on the length of the checkpointing interval and the chosen number of replicas, respectively. For complex scientific workflows, where tasks execute in a well-defined order, reliability is a further major challenge because of the unreliable nature of grid resources.
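The trade-off described above, between the work protected and the overhead added, can be illustrated with a minimal Python sketch. This is not the method proposed in the paper. It assumes a simplified model in which replicas fail independently with the same per-replica success probability and checkpoints are written at fixed intervals with a constant cost. All function and parameter names are hypothetical.

```python
import math


def replication_success_probability(p_success: float, replicas: int) -> float:
    """Probability that at least one replica completes the job.

    Assumption (not from the paper): replicas fail independently and
    each one succeeds with the same probability `p_success`.
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be in [0, 1]")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - p_success) ** replicas


def checkpoint_overhead(job_length: float, interval: float,
                        checkpoint_cost: float) -> float:
    """Total time spent writing checkpoints in a failure-free run.

    Assumption: one checkpoint is written at the end of every interval,
    except that none is needed when the job itself finishes.
    """
    if job_length <= 0 or interval <= 0 or checkpoint_cost < 0:
        raise ValueError("job_length and interval must be positive, "
                         "checkpoint_cost non-negative")
    checkpoints = math.ceil(job_length / interval) - 1
    return checkpoints * checkpoint_cost


def max_lost_work(interval: float) -> float:
    """Upper bound on the work lost to a single failure: one interval."""
    return interval


if __name__ == "__main__":
    # More replicas raise the chance of completion but multiply resource use.
    for n in (1, 2, 3):
        print(n, replication_success_probability(0.5, n))
    # Shorter intervals bound the lost work more tightly but cost more time.
    for interval in (10, 30):
        print(interval, checkpoint_overhead(100, interval, 2),
              max_lost_work(interval))
```

Under these assumptions, shortening the checkpointing interval lowers the bound on lost work while increasing the time spent writing checkpoints, and adding replicas raises the probability of completion at the price of additional resources.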
Authors and Affiliations
Pankaj Gupta