Consensus Series | The Byzantine Generals Problem
The blockchain industry has been benefiting from those people who are thinking about and exploring in two directons: some are imagining and expecting more secure and reliable DAPPs to emerge on the blockchain, giving unparalleled freedom and speed to the flow of digital assets; others are deeply thinking about the consensus mechanism under the blockchain, examining the most fundamental principles of human collaborative activities, in order to find the most direct and credible path to consensus and change the model of human organizations.
Poly Network is a project that enables the transfer of assets between heterogeneous blockchains, and we study consensus mechanisms to ensure the security of assets across different blockchains. We will write a series of articles to introduce the mainstream consensus algorithms and explain as clearly as possible the logic of their schema and the principles of game theory behind them. After carefully comparing these algorithms, Poly Network will be able to gain inspiration from the researches and find a consensus mechanism that better matches its own growth requirements.
The following content is the first article of this consensus series, so let’s start with the basic dilemma of distributed systems.
CAP Trilemma
In the early days of computer’s history, the excellent computing performance of mainframes attracted the interest of government, finance and telecom departments. However, if the mainframe failed to work, the whole system would be unavailable and the business and operations would be paralyzed. With the emergence of reliable and stable communication networks, multi-machine systems have gradually become a reality. With different machines cooperating with each other and backing each other up to form a distributed system, the aforementioned “single point of failure” problem would be circumvented. However, distributed systems also have their own challenges to deal with.
We define a distributed system as one in which hardware or software components located at networked computers communicate and coordinate their actions only by passing messages.
— Distributed Systems, George Coulouris, et al.
Before the inventation of Bitcoin, computer scientists had been carefully studying distributed systems for decades and proved the CAP impossibility theorem, which states that it is impossible for a distributed computing system to satisfy the following three properties at the same time:
- Consistency: all clients access the same copy of the most up-to-date data
- Aailability: the client gets a response to every request, but there is no guarantee that the received data is up-to-date
- Partition tolerance: the system fails to achieve data consistency within the time limit, then partitioning occurs, meaning that the received data from different partitions is not consistent
Let’s set up the simplest distributed system model to further explain, which contains two nodes A and B. Suppose that the databases of nodes A and B are identical at some time point, and after that time point client C accesses node A and initiates a write operation to the database of A . Thus, the databases of nodes A and B are no longer identical, which means that the system is partitioned (partition tolerance is satisfied). Of the two remaining properties, at most one more can be satisfied: to ensure consistency, node B can no longer be available(off-line); if availability is to be guaranteed, data consistency is automatically lost.
The above picture is the famous CAP Trilemma, which, like the second law of thermodynamics, causes despair among designers of distributed systems, and today the blockchain industry also suffers from it.
Byzantine Generals Problem
Two decades before the CAP theorem was proved, in 1982, computer scientist Lamport proposed the Byzentine Generals Problem to solve data consistency in distributed systems.
The enemy city was surrounded by several Byzantine armies, each commanded by a general. They relied on fast-moving messengers to deliver messages in order to reach a unanimous action, i.e., attack or withdraw. The final action could be decided if and only if more than half of the generals agreed.
At first glance, it seems to be a very simple problem. But in fact the devil is hidden inside the details: whether the messenger can deliver the action command to target in time; whether the message has been tampered with by the messenger; and the most serious concern is whether there would be a traitor inside the generals who will send a wrong action. All of these realistic concerns can affect the consistency process and thus lead to the failure or success of the operation. With the low latency and stability of modern communication systems, the first problem was beautifully solved, and the invention of cryptographic algorithms and digital signatures by cryptographers elegantly solved the second problem. And the third problem is essentially one of trust, and is the puzzle that blockchain is trying to solve for the world today.
The Byzantine generals needed a consensus algorithm to ensure these propositions:
A. all loyal generals eventually take the same action
B. Traitors do not mislead loyal generals into taking a wrong action
To better describe the consensus process, the following notation is agreed upon
To ensure that propositions A and B hold, the consensus algorithm must satisfy
Condition 1. each loyal general must get the same set of messages V
Condition 2. If general ⅰ is loyal, then each loyal general’s message set V must contain the same correct υⅰ
In the Byzantine model, the minimum number of loyal generals is 3. As will be shown, as long as there is one traitor, the remaining two generals cannot agree to act. As shown in the figure below, three Byzantine generals surround the enemy city, but general G3 becomes a traitor, intending to interfere with the actions of the other two generals.
If there are now 3m generals, a third of them, m traitors, will interfere with the remaining two-thirds, or 2m generals, from agreeing to act as soon as they conspire to disrupt. All that is needed at this point is one more loyal general to help complete the consensus process.
This is the model proposed by Lamport for solving data consistency: in a distributed system, the maximum number of malicious nodes that can be tolerated must be less than one-third of the total for consensus to be finalized among honest nodes, otherwise there is no solution.
Coming Next
After Byzantine model has been proposed, computer scientists have continued to design different algorithms for this model to ensure that systems can reach consensus efficiently and credibly until Practical Byzantine Fault Tolerance(pBFT) has been created, which is the first algorithm that is seen as practical to solve BFT. We will introduce pBFT in the coming article.
Follow us
Website | Telegram | Medium | Twitter | Discord | Github| Forum