Practical Byzantine Fault Tolerance (PBFT)


What Is Practical Byzantine Fault Tolerance (PBFT)? 

Practical Byzantine Fault Tolerance (PBFT) is a consensus algorithm designed to work efficiently in an asynchronous system (no assumptions are made about timing) and optimised to have low latency. 

First introduced by Miguel Castro and Barbara Liskov in a 1999 study, the PBFT algorithm is a form of state machine replication, which means a system state is replicated and distributed across multiple servers. This ultimately makes the system more robust because failures like Byzantine faults can occur in isolation, as opposed to causing the failure of an entire network.

How the PBFT Algorithm Works

Fundamentally, nodes in a PBFT model are ordered in a sequence where one node is regarded as the primary node (leader) and other nodes are referred to as backups. However, all nodes within a network still maintain communication. The goal is for all honest nodes to come to an agreement regarding the state of the system through a majority rule. 

For the PBFT model to function, the maximum number of malicious nodes must not be greater than or equal to one-third of all the nodes in the system. Each round of PBFT consensus (commonly referred to as ‘views’) is summarised into four simple phases: 

  1. A client sends a request for a service operation to the primary node (leader).

  1. The primary node distributes this request to all the backup replicas at the same time.

  1. The replicas execute the request and send a reply to the client.

  1. The client waits for replies from different replicas with similar results.

The primary node changes every view, which can be replaced with a ‘view change’ protocol if the request is not broadcast after a certain period. Additionally, if a primary node is found to be faulty, it can be removed with the agreement of a large majority of honest nodes.

Examples of platforms using PBFT variants are Zilliqa, Hyperledger, and Tendermint.

Key Takeaway

Practical Byzantine Fault Tolerance (PBFT) is an algorithm that prevents Byzantine faults in asynchronous environments and optimises for low latency.

Related Words