
doi: 10.3897/jucs.120840
Fault tolerance approaches in distributed systems are essentially based on replication and checkpointing. Each of these approaches has its advantages and limitations. This paper has two objectives: first, it proposes a fault tolerance approach based on the nodes status of a distributed system. For this purpose, it defines 3 nodes status: safety, faulty and potentially faulty. With respect of classical node status (safety, faulty), it introduces a new status that we call potentially faulty. This last node allows to enhance the availability of a distributed system. Second, it discusses the efficiency of the proposed model on two types of architectures: virtual multi-node cluster and a physical multi-node cluster with WIFI connection. Experiments have showed that proposed approach increases the system performance throughput and its fault tolerance level.
Distributed Systems, Hadoop, Netw, Electronic computers. Computer science, Fault Tolerance, QA75.5-76.95, Networks, Node Failures
Distributed Systems, Hadoop, Netw, Electronic computers. Computer science, Fault Tolerance, QA75.5-76.95, Networks, Node Failures
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
