Ethereum's Evolving Architecture
Ethereum’s infrastructure is constantly evolving. The big change in 2022 was the move from Proof-of-Work (PoW) to Proof-of-Stake (PoS) consensus. This upgrade also came with a big architectural shift: from monolithic to modular. The modular architecture decouples responsibilities across different parts of the network, creating new roles. Over time, this specialization is meant to let the network scale and operate more efficiently by separating the hardware and storage requirements of different actors. A role-based network opens new opportunities and responsibilities for MEV pools, Layer-2 aggregators, the EVM execution network, and the consensus network. It also poses challenges around decentralization and security.
Let’s dive into the two most relevant modules responsible for core blockchain functionality: the execution layer and the consensus layer.
If you are unfamiliar with Blockchain's architecture I have another post explaining some of the basics:
Overview
Blockchain protocols like Ethereum enable Byzantine-fault-tolerant (BFT), distributed, and decentralized systems. These systems aim to be politically decentralized but logically centralized, so that they give consistent results. The Byzantine formulation means that no participant has a single overall view, or God-mode, of the absolute state of the system. Moreover, any node in the system may behave out-of-protocol (a Byzantine fault). Examples include delaying or reordering messages, lying, sending contradictory messages to different peers, and being unresponsive.
The Ethereum blockchain aims to provide logical consistency by agreeing on a single history of transactions, that is, by building ‘consensus’: reliability on top of unreliable components. The CAP theorem is a famous result stating that a distributed system cannot provide all three of consistency, availability, and partition tolerance at the same time, so tradeoffs have to be made.
In the near term Ethereum prioritizes liveness, meaning the system continues to accept transactions, through a consensus protocol termed LMD Ghost. In the long term it also provides safety, meaning the system can’t revert a committed transaction, through a bolt-on consensus mechanism called Casper FFG. We’ll talk about these a bit later.
The Ethereum protocol describes a network of individual nodes, each acting independently and communicating over an asynchronous, unreliable internet. In the following sections we will get into the basic components that provide liveness and safety to the network, but we will not discuss the built-in incentive mechanism in this post. We assume the reader has some familiarity with blockchain-based systems.
The Nuts and Bolts
Putting it together: a consensus protocol, followed by validators and paced by epoch time intervals, uses validator votes to commit blocks of transactions, with guarantees backed by staked value (economic finality).
Validators and Client
A client is software that runs the Ethereum protocol. There are two types of clients, execution and consensus, working in tandem to participate in the Ethereum network.
In Bitcoin’s PoW model the term for network participants is “miners”, whereas in Ethereum it is “validators”. In modular PoS Ethereum the concept of a validator is a bit more abstract: it is in fact a participating entity. The first step is to stake ETH in order to participate in securing the network and earn rewards. This is done by a staker (a wallet address). For every 32 ETH staked, one unique validator entity becomes eligible to participate. Each validator has to clear some hoops to be voted in by the rest of the network before it is activated. Once activated, it takes on execution and consensus responsibilities and earns rewards until it voluntarily exits or is removed by its peers.
Validators run on a client. The consensus client maintains the Beacon Chain, a blockchain of consensus state. A client can host multiple validator entities, and in most cases a validator participates through both an execution client and a consensus client.
Given the Byzantine (BFT) nature of the protocol, no two validators trust each other, even if they were activated by the same staker. This means an honest validator can assume, at any point in time, that another peer may be compromised. The only trusted communication channel it has is the one between the execution and consensus clients of the same validator instance: a secure RPC call that is not under Byzantine assumptions. In other words, the concept of a validator entity spans both the execution and the consensus client.
Execution Layer
The execution layer is responsible for maintaining the BFT state machine. Its primary job is to run transactions through the EVM and update the local state. The client has two main functions: proposing a new block by selecting and running transactions, and signing attestations (confirmations) that a proposed block is valid.
In order to run a transaction it must keep a state that it believes belongs to the consensus chain-head block. Additionally, it keeps a tree of additional blocks that may not be finalized, and can potentially be the consensus chain with some non-negligible probability.
What it does NOT maintain anymore is the blockchain itself. The execution client knows all about blocks, transactions, and account balances. It does not need to track the consensus chain itself: which blocks are being brought into consensus, reorgs, or any other coordination responsibilities. That belongs to the consensus client.
This is one of the reasons why querying data from an Ethereum execution client alone is insufficient for understanding which transactions are committed.
Consensus (Beacon) Layer
The consensus layer holds the orchestration responsibilities. This client maintains a registry of validator addresses and their state, the beacon chain itself, and aggregated attestation signatures. It also runs the consensus algorithms (GHOST and Casper), selects validator committees, handles the economic functions of staking and slashing, and makes sure the blockchain makes progress by moving ahead and finalizing old blocks.
Chain Mechanics
Slots and Epochs
Ethereum’s new PoS consensus protocol runs in epochs of 6 minutes and 24 seconds. Each epoch is divided into 32 slots of 12 seconds each. Slots and epochs are virtual constructs for synchronization: they provide a cadence for the protocol’s operations and spread out the work of attestations.
Each slot is an alignment boundary where a block may be added. A slot may be left empty if the network was not able to build and attest (confirm) a block within the 12-second window (slot 33 in the diagram below).
At every epoch, validators are pseudorandomly assigned one of two roles: propose (build) a block, or attest to (confirm and vote on) the validity of a proposed block, with each vote weighted in proportion to the validator’s balance. Additionally, validators police each other for certain infringements (conflicting votes, multiple blocks, etc.).
The primary function of the epoch is to spread out the workload of handling attestations (votes) among subsets of validators.
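The slot and epoch arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the constants come from the text, while the helper names and the `genesis_time` parameter are illustrative assumptions rather than client code.

```python
SECONDS_PER_SLOT = 12   # from the text: each slot lasts 12 seconds
SLOTS_PER_EPOCH = 32    # from the text: each epoch has 32 slots


def epoch_duration_seconds() -> int:
    """Length of one epoch in seconds."""
    return SECONDS_PER_SLOT * SLOTS_PER_EPOCH


def epoch_of_slot(slot: int) -> int:
    """Epoch that a given slot number falls into."""
    return slot // SLOTS_PER_EPOCH


def slot_start_time(genesis_time: int, slot: int) -> int:
    """Unix time at which a slot opens, given a (hypothetical) genesis time."""
    return genesis_time + slot * SECONDS_PER_SLOT


print(divmod(epoch_duration_seconds(), 60))  # (6, 24): 6 minutes 24 seconds
print(epoch_of_slot(33))                     # 1: slot 33 is the second slot of epoch 1
```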
Committees
Historically, in Ethereum just as in Bitcoin, every validator (miner) participated in validating every block. This provided security in numbers. In order to scale further, Ethereum’s architecture adopted a sharded approach, where a subset of validators is responsible for confirming each block. Sharding validators into smaller groups significantly reduces the number of messages that need to be passed around.
There are two types of committees: a) attestation committee, and b) sync committee.
An attestation committee is a group of validators assigned to each slot in the epoch. Since there are 32 slots, there will be 32 groups of committees, with at least 128 validators in each committee.
At every epoch, validators are pseudo-randomly shuffled and assigned to slot committees. For a given slot, one validator is selected as the block proposer and the rest act as attesters. Note that all of these assignments and operations are messages broadcast through the networking layer and cryptographically signed by the validator, so the Byzantine property still holds: an honest majority can’t be fooled. Each attestation is a two-part vote: one for the committee’s proposed block, and one for the head of the Beacon Chain.
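To make the shuffling step concrete, here is a simplified sketch of assigning validators to the 32 slot committees of an epoch. It is not the shuffle the specification uses: ordering validators by a hash of `seed || index` is a stand-in chosen for brevity, and the function names and the `seed` parameter are assumptions. Proposer selection is left out.

```python
import hashlib

SLOTS_PER_EPOCH = 32


def shuffle_validators(indices: list[int], seed: bytes) -> list[int]:
    """Deterministic pseudo-random permutation: order by sha256(seed || index)."""
    def sort_key(index: int) -> bytes:
        return hashlib.sha256(seed + index.to_bytes(8, "little")).digest()
    return sorted(indices, key=sort_key)


def assign_committees(indices: list[int], seed: bytes) -> dict[int, list[int]]:
    """Split the shuffled validator list into one committee per slot."""
    shuffled = shuffle_validators(indices, seed)
    total = len(shuffled)
    committees = {}
    for slot in range(SLOTS_PER_EPOCH):
        start = total * slot // SLOTS_PER_EPOCH
        end = total * (slot + 1) // SLOTS_PER_EPOCH
        committees[slot] = shuffled[start:end]
    return committees


# 4096 validators spread over 32 slots gives 128 validators per committee.
committees = assign_committees(list(range(4096)), seed=b"epoch-seed")
print(len(committees), len(committees[0]))  # 32 128
```

Because every node derives the same seed from chain data, every node can compute the same assignment locally without extra coordination messages.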
Consensus
A consensus protocol is the process by which the network agrees on the ordering of transactions. A foundational algorithm, Practical Byzantine Fault Tolerance (PBFT), was published in 1999 by Castro and Liskov.
PBFT prioritizes “safety” and does not have forks. Nakamoto consensus in Bitcoin prioritizes liveness over safety, making it fork-ful. Forks, or branches, occur when the network has not reached strong consensus but still chooses to build blocks (soft-committing transactions), hoping that one of the branches will eventually be finalized by a supermajority of the network. Even with honest participants, branches occur because of network latency or bugs in client implementations. Remember that blockchains have no God-mode view of the overall state at any point in time. Updates from each validator need to be gossiped throughout the network, which takes time. The shorter the block formation window, the higher the likelihood of branches.
Branches of a block chain tree:
A fork-choice rule is a function GetHead(Store) → Head that, given the local branch state, returns the head block. In the diagram above, the fork choice resolves to block g. As the validator receives new votes it must re-run the protocol’s fork-choice rule, which may lead it to reorganize and switch to a different branch as the head to build on.
Ethereum’s balance of liveness and safety is designed via a combination of two consensus protocols, LMD Ghost and Casper FFG, a pairing sometimes called Gasper. During each epoch a validator makes 2 distinct votes: an LMD Ghost vote and a Casper FFG vote.
GHOST Consensus
GHOST stands for “Greedy Heaviest Observed SubTree”, a concept similar to longest-chain in Nakamoto consensus. LMD is a refinement and stands for Latest Message Driven: only the most recent vote from each validator is counted. The LMD-GHOST algorithm recursively selects, at each fork, the branch with the highest cumulative stake voting for it, and the result is the canonical chain. The intuition is that building on a specific branch is a vote for all parents in that branch. This process runs continuously as new attestations are made in each slot.
These parts of the local Store are relevant to the fork choice calculation:
- Block tree with block headers
- List of valid attestation messages from validators
- Validator’s balances, adjusted for each branch choice
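A minimal sketch of how a fork-choice rule could combine these three pieces of the Store. The data layout (`parent_of`, `latest_votes`, `balances`) and the tie-break on block id are assumptions made for illustration; a real client also filters blocks and votes in ways not shown here.

```python
from collections import defaultdict
from typing import Optional


def get_head(
    parent_of: dict[str, Optional[str]],  # block id -> parent id (None for the root)
    latest_votes: dict[int, str],         # validator index -> block it last voted for
    balances: dict[int, int],             # validator index -> stake
    root: str,
) -> str:
    """Greedy walk from the root, always taking the child with the most stake behind it."""
    # A vote for a block also counts for every ancestor of that block.
    weight: dict[str, int] = defaultdict(int)
    for validator, block in latest_votes.items():
        node: Optional[str] = block
        while node is not None:
            weight[node] += balances[validator]
            node = parent_of[node]

    children: dict[str, list[str]] = defaultdict(list)
    for block, parent in parent_of.items():
        if parent is not None:
            children[parent].append(block)

    head = root
    while children[head]:
        # Heaviest subtree wins; ties are broken by block id (an assumption).
        head = max(children[head], key=lambda b: (weight[b], b))
    return head


# Tree: A <- B, then two branches from B: a long one (C <- F <- G) and a short one (D).
tree = {"A": None, "B": "A", "C": "B", "F": "C", "G": "F", "D": "B"}
votes = {0: "G", 1: "D", 2: "D"}
stake = {0: 32, 1: 32, 2: 32}
print(get_head(tree, votes, stake, root="A"))  # D
```

In this example the short branch wins because two validators back it, which is the difference between heaviest-subtree and longest-chain selection.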
Each validator makes an attestation once per epoch, voting for the best head block after running the fork-choice rule over its local data. In the attestation object, the GHOST vote is the beacon_block_root field.
class AttestationData(Container):
    slot: Slot
    index: CommitteeIndex
    beacon_block_root: Root
    ...
The fork choice rule as explained in a post.
While we won’t go much into the incentive system here, GHOST attestations are regulated by a set of attestation rewards as well as slashing penalties for proposer or attester misbehavior. Determining misbehavior usually requires knowing the ground truth, which we don’t have (remember, there is no God-mode available). So the Ethereum protocol defines misbehavior in terms a validator can check locally: a peer has misbehaved if it is seen to have signed two contradictory messages. The handling of such equivocation is quite detailed, but the core principle is that each validator judges its peers.
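The idea of judging a peer by its own contradictory messages can be sketched as follows. This is a deliberately simplified model: the `SignedVote` type and its fields are hypothetical, signatures are assumed to be already verified, and the actual slashing conditions in the protocol are more detailed than "two different votes in one slot".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedVote:
    validator_index: int
    slot: int
    block_root: str  # the block this vote supports


def find_equivocations(votes: list[SignedVote]) -> set[int]:
    """Return validators seen voting for two different blocks in the same slot."""
    first_seen: dict[tuple[int, int], str] = {}
    offenders: set[int] = set()
    for vote in votes:
        key = (vote.validator_index, vote.slot)
        if key in first_seen and first_seen[key] != vote.block_root:
            offenders.add(vote.validator_index)
        first_seen.setdefault(key, vote.block_root)
    return offenders


observed = [
    SignedVote(1, 10, "0xaa"),
    SignedVote(1, 10, "0xbb"),  # contradicts the previous vote from validator 1
    SignedVote(2, 10, "0xaa"),
    SignedVote(2, 11, "0xbb"),  # different slot, so not a contradiction
]
print(find_equivocations(observed))  # {1}
```

No ground truth is needed: the two signed messages are themselves the evidence.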
Casper Consensus
Ethereum’s commit is a two-phase commit protocol: Justification and Finality. Ghost attestations keep the chain moving, while the Casper votes carried in the same attestations provide Justification, the first phase, and then Finality, the second phase of the commit. Finalization ensures a block cannot be reverted as long as over 2/3 of the stake is controlled by honest validators. On top of this, Casper offers a guarantee called accountable safety (economic finality) for the case where fewer than 2/3 of the validators are honest (more than 1/3 dishonest): finalizing two conflicting checkpoints requires at least 1/3 of the stake to cast provably contradictory votes, which can then be slashed.
A well-known theoretical result is that a BFT protocol on an asynchronous network can tolerate fewer than 1/3 adversarial nodes.
- Take $n$ validators, $f$ of which may be adversarial or faulty.
- To keep liveness we need to make a decision after hearing from $n-f$ validators. But the $f$ missing ones need not be adversaries withholding votes; they could be honest validators whose messages haven’t arrived yet.
- So up to $f$ of the $n-f$ responses we did receive may come from adversarial validators.
- To guarantee that honest validators form a majority of those responses we require $(n-f)/2 > f$, i.e. $n > 3f$.
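As a quick numeric check of the bound $n > 3f$, the sketch below computes the largest tolerable number of faulty validators for a given $n$ and the number of responses to wait for. The function names are illustrative.

```python
def max_faulty(n: int) -> int:
    """Largest f that still satisfies n > 3f."""
    return (n - 1) // 3


def quorum_size(n: int) -> int:
    """Number of responses (n - f) to wait for before deciding."""
    return n - max_faulty(n)


for n in (4, 100):
    f = max_faulty(n)
    honest_in_quorum = quorum_size(n) - f  # worst case: all f faulty nodes responded
    print(n, f, quorum_size(n), honest_in_quorum > f)
# 4 1 3 True
# 100 33 67 True
```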
Voting
From our Ghost discussion we learned that voting is spread out across the epoch. For efficiency, both the Ghost and Casper votes are bundled together in the attestation a validator makes at its assigned slot. Since validators vote during different slots, they all need to align on a common block to vote for. This is called a “checkpoint”. Usually the block in the first slot of an epoch serves as the checkpoint.
A checkpoint simply refers to an epoch and a block root, usually the first block of the epoch.
class Checkpoint(Container):
    epoch: Epoch
    root: Root
A Casper FFG vote is cast alongside Ghost in the attestation object:
class AttestationData(Container):
    slot: Slot
    index: CommitteeIndex
    beacon_block_root: Root
    # Casper Vote
    source: Checkpoint  # Phase 2
    target: Checkpoint  # Phase 1
Casper FFG is a 2-phase commit protocol. Phase 1 exchanges votes for the checkpoint to be finalized. Phase 2 confirms that the validator has seen votes from at least 2/3 of its peer validators for a given checkpoint.
Phase 1 -> Justification
- Broadcast what you think is the best checkpoint
- Listen to the network of validators for their votes
- If 2/3 of validators agree, justify the checkpoint
Phase 2 -> Finalization
- Broadcast your justified checkpoint
- Listen to the network of validators for their justified checkpoints
- If 2/3 of validators agree, finalize the checkpoint
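The two phases above can be sketched as a small bookkeeping routine. This is a simplification under stated assumptions: checkpoints are identified by epoch number only, votes are already tallied into a stake total per source-to-target link, and the names `process_link`, `justified`, and `finalized` are illustrative.

```python
def is_supermajority(voting_stake: int, total_stake: int) -> bool:
    """True when at least 2/3 of the total stake voted (integer math avoids rounding)."""
    return 3 * voting_stake >= 2 * total_stake


def process_link(
    source_epoch: int,
    target_epoch: int,
    voting_stake: int,
    total_stake: int,
    justified: set[int],
    finalized: set[int],
) -> None:
    """Apply the votes for one source -> target checkpoint link."""
    if source_epoch not in justified:
        return  # votes must start from an already justified checkpoint
    if not is_supermajority(voting_stake, total_stake):
        return
    justified.add(target_epoch)           # Phase 1: target becomes justified
    if target_epoch == source_epoch + 1:
        finalized.add(source_epoch)       # Phase 2: justified parent becomes finalized


justified, finalized = {0}, {0}
process_link(0, 1, 70, 100, justified, finalized)
process_link(1, 2, 67, 100, justified, finalized)
process_link(2, 3, 66, 100, justified, finalized)  # below 2/3, so nothing changes
print(sorted(justified), sorted(finalized))  # [0, 1, 2] [0, 1]
```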
At the start of epoch N we have justified checkpoint N-1 and finalized checkpoint N-2. So under normal conditions it takes two epochs, or 2 × 6.4 = 12.8 minutes, for the Ethereum protocol to finalize a checkpoint.
It is important to consider this finalization time when ingesting and making decisions on on-chain transaction activity.
Serialization, Merkleization and Hash Tree Roots
You may have noticed that there is a need to store or compare objects at various points in the protocol. The block root hash in the beacon chain, for example, refers to an execution block header, which in turn refers to state and transaction data. How do we reference a large object compactly? How do Ethereum nodes know that they have a shared view of the state information?
The block header stores the Merkle roots of the transactions, state, and receipts.
We need a digest of the state that no one can fake, which is what a cryptographic hash function provides. Moreover, we want membership tests to be fast. This is what Merkle trees are great for. Ethereum uses a variant: a hexary Merkle Patricia tree.
SSZ: A Simple Serialization Protocol
Ethereum objects need to be serialized to bytes before they can be hashed and Merkleized. The consensus specification defines a format called SSZ, which specifies exactly how each object in the Ethereum specification is serialized.
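As a small illustration of what serializing an object directly means, here is a hand-rolled sketch for the Checkpoint object. It assumes that `Epoch` is a 64-bit unsigned integer encoded little-endian and that `Root` is a 32-byte value, with fixed-size fields simply concatenated in order; the function names are illustrative and this is not a general SSZ encoder.

```python
def serialize_uint64(value: int) -> bytes:
    """Encode an unsigned 64-bit integer as 8 little-endian bytes."""
    return value.to_bytes(8, "little")


def serialize_checkpoint(epoch: int, root: bytes) -> bytes:
    """Concatenate the fixed-size fields of a checkpoint: epoch, then root."""
    if len(root) != 32:
        raise ValueError("root must be exactly 32 bytes")
    return serialize_uint64(epoch) + root


encoded = serialize_checkpoint(5, b"\xab" * 32)
print(len(encoded))       # 40
print(encoded[:8].hex())  # 0500000000000000
```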
Merkleization
Finding the Merkle root of an attestation requires recursively serializing and Merkleizing the leaf data. Leaf data is split into 32-byte chunks, and a value larger than 32 bytes is Merkleized over multiple chunks. SSZ Merkleization builds a binary Merkle tree (not the hexary Patricia tree of the execution layer) whose number of leaves is padded up to a power of two with zero chunks, so a node with 3 children, for example, gets a fourth, all-zero child.
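The chunk-and-hash procedure can be sketched in a few lines. This is a simplified illustration assuming SHA-256 as the hash and zero-chunk padding up to a power of two; it leaves out details such as length mix-ins for variable-size lists, and the helper names are assumptions.

```python
import hashlib

BYTES_PER_CHUNK = 32
ZERO_CHUNK = b"\x00" * BYTES_PER_CHUNK


def chunkify(data: bytes) -> list[bytes]:
    """Right-pad data with zeros and split it into 32-byte chunks."""
    remainder = len(data) % BYTES_PER_CHUNK
    if remainder:
        data += b"\x00" * (BYTES_PER_CHUNK - remainder)
    chunks = [data[i:i + BYTES_PER_CHUNK] for i in range(0, len(data), BYTES_PER_CHUNK)]
    return chunks or [ZERO_CHUNK]


def next_power_of_two(n: int) -> int:
    power = 1
    while power < n:
        power *= 2
    return power


def merkleize(chunks: list[bytes]) -> bytes:
    """Pad the leaves to a power of two, then hash pairs upward until one root remains."""
    layer = list(chunks) + [ZERO_CHUNK] * (next_power_of_two(len(chunks)) - len(chunks))
    while len(layer) > 1:
        layer = [
            hashlib.sha256(layer[i] + layer[i + 1]).digest()
            for i in range(0, len(layer), 2)
        ]
    return layer[0]


signature = bytes(96)           # a 96-byte placeholder, not a real signature
leaves = chunkify(signature)
print(len(leaves))              # 3 chunks, padded to 4 leaves inside merkleize
print(merkleize(leaves).hex())
```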
The root node of attestations is Attestation Root which contains 4 main pieces of information:
- The indices of validators who participated in signature
- Block being attested at a given slot
- Checkpoints being attested for justifying or finalizing
- Aggregate signature of participating validators
Validator Root: Validators are simple indices or IDs of the participating validators
Data Root: The Merkle root of the data, which contains the blocks and checkpoints being attested. These are recursively Merkleized after serialization and chunking with the SSZ algorithm.
Aggregate Signature Root: The merkle data is a single signature of 96 bytes, split into 3 chunks
BLS signature
Ethereum primarily uses ECDSA signatures over the secp256k1 curve. The consensus protocol, however, uses the BLS signature scheme, which is also elliptic-curve based. The primary advantage is that BLS signatures can be aggregated.
There are four components to a signature scheme: the Private Key, the Public Key, the Message being signed, and the Signature. In a signature scheme a message is signed with the private key and verified with the public key, for instance:
sign(private_key, message) → s
verify(s, public_key, message) → True/False
Clients and Validators are the agents acting to build and update a view of the world (account balances, transactions) they believe to be true. Slots and Epochs are intervals regulating message coordination within the network. Blocks are batches of transaction updates to the system. Attestations are digitally signed votes, made by validators, for the latest block update. Financial Staking allows building consistency through votes. Proof-of-Stake replaces PoW’s math puzzle with voting and carrot-or-the-stick mechanics. Just as a block batches transactions to reduce operational overhead, Committees batch individual validator votes to reduce network overhead.
The interesting property of BLS scheme is that multiple signatures can be aggregated to form one signature, which can be validated against an “aggregate” of public keys of the signing entities.
Verifying a BLS signature is roughly 10x more expensive than verifying an ECDSA signature, but attestation aggregation easily makes up for this overhead. If the committee is partitioned and attests to 2 different blocks, there will be 2 aggregate signatures, which is still far fewer than the number of validators in the committee. Aggregates also save space. However, in order to compute the aggregate public key, the Attestation object must identify the validators that participated. In fact, only validator indices are needed, since the validator list and the public keys are known locally to every validator.
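The bookkeeping side of aggregation can be sketched without any cryptography. The example assumes participation is recorded as one bit per committee member, and that each node holds a local registry mapping validator index to public key; the function names and the registry layout are illustrative. The actual BLS key aggregation and signature check are not shown.

```python
def participants(committee: list[int], aggregation_bits: list[bool]) -> list[int]:
    """Validator indices whose bit is set, in committee order."""
    if len(committee) != len(aggregation_bits):
        raise ValueError("one bit is required per committee member")
    return [index for index, bit in zip(committee, aggregation_bits) if bit]


def lookup_pubkeys(indices: list[int], registry: dict[int, bytes]) -> list[bytes]:
    """Fetch the locally known public keys for the participating validators."""
    return [registry[index] for index in indices]


committee = [17, 4, 92, 33]
bits = [True, False, True, True]
registry = {i: f"pubkey-{i}".encode() for i in committee}  # placeholder keys

signers = participants(committee, bits)
print(signers)                                 # [17, 92, 33]
print(len(lookup_pubkeys(signers, registry)))  # 3 keys to aggregate for verification
```

The attestation only has to carry a few bits per committee plus one signature, while the heavier public key data stays in each node's local registry.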
Putting it all together
Now that we have gone through the nuts and bolts of how attestations and the beacon chain operate, let’s summarize the relevant pieces of data in the beacon consensus chain:
In the next post we will see how this consensus information can be used to cryptographically verify on-chain data retrieved from execution nodes.