Title: Visigoth Fault Tolerance
By: Daniel Porto (DI-FCT-UNL)
Host: Computer Systems
Despite recent efforts to make the performance of data center networks and systems more predictable, replication protocols for stateful services have not taken advantage of such efforts; these protocols still make the worst-case assumption of an asynchronous system where all processes or messages can experience an arbitrarily long delay. The alternative assumption of a synchronous system is difficult to guarantee in practice and is therefore not viable. Similarly, the principled approach for tolerating bit flips and data corruption is Byzantine fault tolerance (BFT). Unfortunately, BFT protocols make the worst-case assumption of adversarial behavior, which is unlikely within the data center and increases the replication requirements. In practice, these services are designed optimistically, assuming crash fault tolerance (CFT), which is unable to capture the arbitrary faults that surface at data center scale.
In this talk, I will present Visigoth Fault Tolerance (VFT), a new technique for designing distributed protocols for reliable stateful services. VFT introduces the Visigoth model, which makes it possible to calibrate the timing assumptions of a system using a threshold of slow processes or messages, and also to distinguish between non-malicious arbitrary faults and correlated attack scenarios. This enables solutions that leverage the characteristics of data center systems, namely their secure environment and predictable performance, so that replicated systems can use resources more efficiently than those designed under asynchrony and Byzantine assumptions, without having to make the system synchronous or restrict failure modes to silent crashes.
Based on a EuroSys'15 paper.
I'm a fourth-year PhD student at Nova University of Lisbon*, advised by Rodrigo Rodrigues. My research focuses on designing cost-effective fault tolerance techniques to improve the dependability of cloud-based systems.
I received my Master's degree in distributed computing from the Federal University of Paraíba (UFPB) in João Pessoa, Brazil, where I worked on designing routing protocols for Wireless Mesh Networks. During my Master's, I also worked as an instructor at the ESR/RNP (Network College), teaching courses on network services, protocols, management, security (cryptography, forensics and intrusion detection) and operating systems. My undergraduate studies were also at UFPB. During my Bachelor's, I held internships at a wide range of companies and labs. My final project was a specification and software implementation of the reference access terminal of the Brazilian Digital TV System (TAR-SBTVD), done in collaboration with USP in São Paulo.
*Currently also affiliated with INESC-ID/Lisbon