Journal articles
Improving the Robustness of Distributed Failure Detectors in Adverse Conditions
Date
2012
Recorded in:
IEEE LATIN AMERICA TRANSACTIONS, PISCATAWAY, v. 10, n. 1, supl. 1, Part 1, pp. 1364-1369, JAN, 2012
1548-0992
10.1109/TLA.2012.6142485
Authors
Lemos, Fernando Tarla Cardoso
Sato, Liria Matsumoto
Institution
Abstract
Failure detection is at the core of most fault-tolerance strategies, but it often depends on reliable communication. We present new failure detector algorithms suitable as components of a fault-tolerance system deployed under adverse network conditions (such as loosely connected and loosely administered computing grids). The algorithms pack redundancy into heartbeat messages, which makes them more robust than traditional heartbeat protocols. Experimental tests conducted in a simulated environment with adverse network conditions show significant improvement over existing solutions.
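
The abstract does not spell out how redundancy is packed into heartbeats, so the following is only a minimal sketch of one common way to do it, not the authors' algorithm. It assumes a gossip-style scheme in which every heartbeat carries the sender's latest known counters for all peers, so that a node can learn a peer is alive through an intermediary even when the direct link is lossy. The `Node` class, its method names, and the timeout value are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical failure detector node (illustrative, not from the paper)."""
    name: str
    timeout: float
    # peer name -> (highest counter seen, local time at which it last increased)
    table: dict = field(default_factory=dict)
    counter: int = 0

    def make_heartbeat(self, now):
        """Increment our own counter and return a message with the whole table."""
        self.counter += 1
        self.table[self.name] = (self.counter, now)
        # Redundancy: the message carries counters for every known peer,
        # not only for the sender.
        return {peer: count for peer, (count, _) in self.table.items()}

    def receive(self, message, now):
        """Merge a heartbeat, keeping the highest counter seen for each peer."""
        for peer, count in message.items():
            known, _ = self.table.get(peer, (0, now))
            if count > known:
                self.table[peer] = (count, now)

    def suspects(self, now):
        """Return peers whose counter has not advanced within the timeout."""
        return sorted(
            peer for peer, (_, seen) in self.table.items()
            if peer != self.name and now - seen > self.timeout
        )


# Usage: the direct link A -> C is assumed broken; B relays A's counter to C.
a, b, c = Node("A", 3.0), Node("B", 3.0), Node("C", 3.0)
for t in range(1, 6):
    b.receive(a.make_heartbeat(t), t)
    c.receive(b.make_heartbeat(t), t)
alive_view = c.suspects(5)      # A is not suspected although C never heard from it directly

for t in range(6, 10):          # A stops sending; B keeps going
    c.receive(b.make_heartbeat(t), t)
failed_view = c.suspects(9)     # A's counter has been stale for longer than the timeout
```

In this sketch the cost of the redundancy is message size, which grows with the number of known peers; a real deployment would likely bound or sample the table.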