Possible futures of the Ethereum protocol, part 1: The Merge

2024 Oct 14

Special thanks to Justin Drake, Hsiao-wei Wang, @antonttc, Anders Elowsson and Francesco for feedback and review.

Originally, "the Merge" referred to the most important event in the
Ethereum protocol's history since its launch: the long-awaited and
hard-earned transition from proof of work to proof of stake. Today,
Ethereum has been a stably running proof of stake system for almost
exactly two years, and this proof of stake has performed remarkably well
in stability,
performance and avoiding
centralization risks. However, there still remain some important
areas in which proof of stake needs to improve.

My roadmap diagram from 2023 separated this out into buckets:
improving technical features such as stability, performance,
and accessibility to smaller validators, and economic changes
to address centralization risks. The former got to take over the heading
for "the Merge", and the latter became part of "the Scourge".

The Merge, 2023 roadmap edition.

This post will focus on the "Merge" part: what can still be
improved in the technical design of proof of stake, and what are some
paths to getting there?

This is not meant as an exhaustive list of things that could be done
to proof of stake; rather, it is a list of ideas that are actively being
considered.

The Merge: key goals

- Single slot finality
- Transaction confirmation and finalization as fast as possible, while preserving decentralization
- Improve staking viability for solo stakers
- Improve robustness
- Improve Ethereum's ability to resist and recover from 51% attacks (including finality reversion, finality blocking, and censorship)

Single slot finality and staking democratization

What problem are we solving?

Today, it takes 2-3 epochs (~15 min) to finalize a block, and 32 ETH
is required to be a staker. This was originally a compromise meant to balance
between three goals:

- Maximizing the number of validators that can participate in staking (this directly implies minimizing the min ETH required to stake)
- Minimizing the time to finality
- Minimizing the overhead of running a node, in this case the cost of downloading, verifying and re-broadcasting all the other validators' signatures

The three goals are in conflict: in order for economic finality to be
possible (meaning: an attacker would need to burn a large amount of ETH
to revert a finalized block), you need every single validator to sign
two messages each time finality happens. And so if you have many
validators, either you need a long time to process all their signatures,
or you need very beefy nodes to process all the signatures at the same
time.
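To make the conflict concrete, here is a rough back-of-the-envelope sketch. It assumes a round figure of 32 million ETH staked (an illustrative number, not a measurement) and the worst case where every validator stakes exactly the minimum; the function name is made up for this example.

```python
def signatures_per_second(total_stake_eth: float, min_deposit_eth: float,
                          finality_seconds: float) -> float:
    """Signatures a node must process per second, if every validator
    signs two messages each time finality happens."""
    # Worst case: all stake is split into minimum-size validators.
    max_validators = total_stake_eth / min_deposit_eth
    return 2 * max_validators / finality_seconds


TOTAL_STAKE = 32_000_000  # assumed round figure, for illustration only

status_quo = signatures_per_second(TOTAL_STAKE, 32, 15 * 60)
ideal = signatures_per_second(TOTAL_STAKE, 1, 12)

print(f"32 ETH minimum, ~15 min finality: {status_quo:,.0f} signatures/s")
print(f"1 ETH minimum, 12 s finality:     {ideal:,.0f} signatures/s")
print(f"ratio: {ideal / status_quo:,.0f}x")
```

Under these assumptions, getting both a 1 ETH minimum and 12-second finality multiplies the per-second signature load by roughly 2,400x, which is why improving two of the goals at once puts so much pressure on the third.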

Note that this is all conditional on a key goal of Ethereum:
ensuring that even successful attacks have a high cost to the
attacker. This is what is meant by the term "economic
finality". If we did not have this goal, then we could solve this
problem by randomly selecting a committee to finalize each slot. Chains
that do not attempt to achieve economic finality, such as Algorand, often
do exactly this. But the problem with this approach is that if an
attacker does control 51% of validators, then they can perform
an attack (reverting a finalized block, or censoring, or delaying
finality) at very low cost: only the portion of their nodes that are in
the committee could be detected as participating in the attack and
penalized, whether through slashing
or socially-coordinated
soft fork. This means that an attacker could repeatedly attack the
chain many times over, losing only a small portion of their stake during
each attack. Hence, if we want economic finality, a naive
committee-based approach does not work, and it appears at first glance
that we do need the full set of validators to participate.
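A minimal sketch of why the naive committee approach is cheap to attack. The stake figures and committee size here are illustrative assumptions, and the function name is hypothetical; the only point is that stake which never signs anything cannot be attributed to the attack.

```python
def penalizable_stake(attacker_stake_eth: int, committee_size: int,
                      validator_count: int) -> float:
    """Expected attacker stake that signs in one attack (and so can be
    penalized), assuming the committee is a uniform random sample."""
    return attacker_stake_eth * committee_size / validator_count


TOTAL_STAKE = 32_000_000            # assumed round figure
VALIDATORS = 1_000_000              # assumed: everyone stakes 32 ETH
attacker = TOTAL_STAKE * 51 // 100  # attacker controls 51% of stake

# Everyone signs: all of the attacker's stake is exposed.
full_set = penalizable_stake(attacker, VALIDATORS, VALIDATORS)
# Only a 1,000-validator committee signs: a tiny slice is exposed.
small_committee = penalizable_stake(attacker, 1_000, VALIDATORS)

print(f"full validator set: {full_set:,.0f} ETH exposed")
print(f"1,000-validator committee: {small_committee:,.0f} ETH exposed")
```

With a committee that is 0.1% of the validator set, each attack exposes only 0.1% of the attacker's stake, so the attacker can repeat the attack many times before running out.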

Ideally, we want to preserve economic finality, while
simultaneously improving on the status quo in two areas:

- Finalize blocks in one slot (ideally, keep or even reduce the current length of 12s), instead of 15 min
- Allow validators to stake with 1 ETH (down from 32 ETH)

The first goal is justified by two considerations, both of which can be
viewed as "bringing Ethereum's properties in line with those of (more
centralized) performance-focused L1 chains".

First, it ensures that all Ethereum users actually benefit
from the higher level of security assurances achieved through
the finality mechanism. Today, most users do not, because they are not
willing to wait 15 minutes; with single-slot finality, users will see
their transactions finalized almost as soon as they are confirmed.
Second, it simplifies the protocol and surrounding
infrastructure if users and applications don't have to worry
about the possibility of the chain reverting except in the relatively
rare case of an inactivity
leak.

The second goal is justified by a desire to support solo
stakers. Poll after poll shows that the main factor
preventing more people from solo staking is the 32 ETH minimum. Reducing
the minimum to 1 ETH would solve this issue, to the point where other
concerns become the dominant factor limiting solo staking.

There is a challenge: the goals of faster finality and more
democratized staking both conflict with the goal of minimizing
overhead. And indeed, this fact is the entire reason why we did
not start with single-slot finality to begin with. However, more recent
research presents a few possible paths around the problem.

What is it and how does it work?

Single-slot finality involves using a consensus algorithm that
finalizes blocks in one slot. This in itself is not a difficult goal:
plenty of algorithms, such as Tendermint
consensus, already do this with optimal properties. One desired
property unique to Ethereum, which Tendermint does not support, is inactivity
leaks, which allow the chain to keep going and eventually recover
even when more than 1/3 of validators go offline. Fortunately, this
desire has already been addressed: there are already proposals that
modify Tendermint-style consensus to accommodate inactivity leaks.

A leading single slot finality proposal

The harder part of the problem is figuring out how to make
single-slot finality work with a very high validator count, without
leading to extremely high node-operator overhead. For this, there are a
few leading solutions:

Option 1: Brute force - work hard on
implementing better signature aggregation protocols, potentially using
ZK-SNARKs, which would actually allow us to process signatures from
millions of validators in each slot.

Horn, one of the proposed designs for a better aggregation protocol.

Option 2: Orbit
committees - a new mechanism which allows a
randomly-selected medium-sized committee to be responsible for
finalizing the chain, but in a way that preserves the cost-of-attack
properties that we are looking for.

One way to think about Orbit SSF is that it opens up a space of
compromise options along a spectrum from x=0 (Algorand-style committees,
no economic finality) to x=1 (status quo Ethereum), opening up points in
the middle where Ethereum still has enough economic finality to be
extremely secure, but at the same time we get the efficiency benefits of
only needing a medium-sized random sample of validators to participate
in each slot.

Orbit takes advantage of pre-existing heterogeneity in validator
deposit sizes to get as much economic finality as possible, while still
giving small validators a proportionate role. In addition, Orbit uses
slow committee rotation to ensure high overlap between adjacent quorums,
ensuring that its economic finality still applies at committee-switching
boundaries.

Option 3: two-tiered staking - a mechanism where
there are two classes of stakers, one with higher deposit requirements
and one with lower deposit requirements. Only the higher-deposit tier
would be directly involved in providing economic finality. There are
various proposals (eg. see the
Rainbow staking post) for exactly what rights and responsibilities
the lower-deposit tier has. Common ideas include:
- the right to delegate stake to a higher-tier
  staker
- a random sample of lower-tier stakers attesting to,
  and being needed to finalize, each block
- the right to generate inclusion
  lists

What are some links to existing research?

Paths toward single slot finality (2022): https://notes.ethereum.org/@vbuterin/single_slot_finality
A concrete proposal for a single slot finality protocol for Ethereum
(2023): https://eprint.iacr.org/2023/280
Orbit SSF: https://ethresear.ch/t/orbit-ssf-solo-staking-friendly-validator-set-management-for-ssf/19928
Further analysis on Orbit-style mechanisms: https://ethresear.ch/t/vorbit-ssf-with-circular-and-spiral-finality-validator-selection-and-distribution/20464
Horn, signature aggregation protocol (2022): https://ethresear.ch/t/horn-collecting-signatures-for-faster-finality/14219
Signature merging for large-scale consensus (2023): https://ethresear.ch/t/signature-merging-for-large-scale-consensus/17386?u=asn
Signature aggregation protocol proposed by Khovratovich et al: https://hackmd.io/@7dpNYqjKQGeYC7wMlPxHtQ/BykM3ggu0#/
STARK-based signature aggregation (2022): https://hackmd.io/@vbuterin/stark_aggregation
Rainbow staking: https://ethresear.ch/t/unbundling-staking-towards-rainbow-staking/18683

What is left to do, and what are the tradeoffs?

There are four major possible paths to take (and we can also take
hybrid paths):

1. Maintain status quo
2. Brute-force SSF
3. Orbit SSF
4. SSF with two-tiered staking

(1) means doing no work and leaving staking as is,
but it leaves Ethereum's security experience and staking centralization
properties worse than they could be.

(2) brute-forces the problem with high tech. Making
this happen requires aggregating a very large number of signatures (1
million+) in a very short period of time (5-10s). One way to think of
this approach is that it involves minimizing
systemic complexity by going all-out on accepting encapsulated
complexity.

(3) avoids "high tech", and solves the problem with
clever rethinking around protocol assumptions: we relax the "economic
finality" requirement so that we require attacks to be expensive, but
are okay with the cost of attack being perhaps 10x less than today (eg.
$2.5 billion cost of attack instead of $25 billion). It's a common view
that Ethereum today has far more economic finality than it needs, and
its main security risks are elsewhere, and so this is arguably an okay
sacrifice to make.

The main work to do is verifying that the Orbit mechanism is safe and
has the properties that we want, and then fully formalizing and
implementing it. Additionally, EIP-7251 (increase max
effective balance) allows for voluntary validator balance
consolidation that immediately reduces the chain verification overhead
somewhat, and acts as an effective initial stage for an Orbit
rollout.

(4) avoids clever rethinking and high tech,
but it does create a two-tiered staking system which still has
centralization risks. The risks depend heavily on the specific rights
that the lower staking tier gets. For example:

- If a low-tier staker needs to delegate their attesting rights to a high-tier staker, then delegation could centralize and we would thus end up with two highly centralized tiers of staking.
- If a random sample of the lower tier is needed to approve each block, then an attacker could spend a very small amount of ETH to block finality.
- If lower-tier stakers can only make inclusion lists, then the attestation layer may remain centralized, at which point a 51% attack on the attestation layer can censor the inclusion lists themselves.

Multiple strategies can be combined, for example:

(1 + 2): use brute-force techniques to reduce the min deposit size
without doing single slot finality. The amount of aggregation required
is 64x less than in the pure (2) case, so the problem becomes
easier.

(1 + 3): add Orbit without doing single slot finality

(2 + 3): do Orbit SSF with conservative parameters (eg. 128k
validator committee instead of 8k or 32k), and use brute-force
techniques to make that ultra-efficient.

(1 + 4): add rainbow staking without doing single slot finality

How does it interact with other parts of the roadmap?

In addition to its other benefits, single slot finality reduces the
risk of certain
types of multi-block MEV attacks. Additionally, attester-proposer
separation designs and other in-protocol block production pipelines
would need to be designed differently in a single-slot finality
world.

Brute-force strategies have the weakness that they make it harder to
reduce slot times.

Single secret leader election

What problem are we solving?

Today, which validator is going to propose the next block is known
ahead of time. This creates a security vulnerability: an attacker can
watch the network, identify which validators correspond to which IP
addresses, and DoS attack each validator right when they are about to
propose a block.

What is it and how does it work?

The best way to fix the DoS issue is to hide the information about
which validator is going to produce the next block, at least until the
moment when the block is actually produced. Note that this is easy if we
remove the "single" requirement: one
solution is to let anyone create the next block, but require the randao
reveal to be less than 2²⁵⁶ / N. On average, only one
validator would be able to meet this requirement - but sometimes there
would be two or more and sometimes there would be zero. Combining the
"secrecy" requirement with the "single" requirement has long been the
hard problem.
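A quick sketch of why the non-"single" version falls short. It models each validator's randao reveal as an independent uniform value, so each of N validators passes the 2²⁵⁶ / N threshold with probability 1/N; that independence model and the function name are assumptions made for illustration.

```python
from math import comb


def eligible_count_probability(n_validators: int, k: int) -> float:
    """P(exactly k of n validators have a reveal below 2**256 / n),
    treating each reveal as an independent uniform 256-bit value."""
    p = 1 / n_validators
    return comb(n_validators, k) * p**k * (1 - p) ** (n_validators - k)


N = 1_000_000  # illustrative validator count
none = eligible_count_probability(N, 0)
one = eligible_count_probability(N, 1)
two_or_more = 1 - none - one

print(f"no eligible proposer:       {none:.1%}")
print(f"exactly one proposer:       {one:.1%}")
print(f"two or more proposers:      {two_or_more:.1%}")
```

For large N the count of eligible validators is approximately Poisson with mean 1, so under this model a slot has exactly one eligible proposer only a bit more than a third of the time, and is empty about as often.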

Single secret leader election protocols solve this by using some
cryptographic techniques to create a "blinded" validator ID for each
validator, and then giving many proposers the opportunity to
shuffle-and-reblind the pool of blinded IDs (this is similar to how a mixnet works).
During each slot, a random blinded ID is selected. Only the owner of
that blinded ID is able to generate a valid proof to propose the block,
but no one else knows which validator that blinded ID corresponds
to.
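The blinding idea can be illustrated with a toy sketch. This is not the Whisk protocol itself: it uses a small multiplicative group instead of an elliptic curve, a seeded non-cryptographic RNG, one shared reblinding exponent per shuffle, and it omits the zero-knowledge proofs that a real design needs to show a shuffle was done honestly. All names and parameters below are assumptions for illustration.

```python
import random

P = 2**127 - 1  # a Mersenne prime; toy-sized, NOT a secure group choice
G = 3           # base element, chosen arbitrarily for the toy


def make_tracker(secret_k: int, rng: random.Random) -> tuple[int, int]:
    """Blinded ID: (g^r, g^(r*k)) for a fresh random r."""
    a = pow(G, rng.randrange(2, P - 1), P)
    return (a, pow(a, secret_k, P))


def reblind_and_shuffle(trackers, rng: random.Random):
    """Raise both halves of every tracker to a random z, then permute.
    The relation second == first^k is preserved, but values change."""
    z = rng.randrange(2, P - 1)
    out = [(pow(a, z, P), pow(b, z, P)) for a, b in trackers]
    rng.shuffle(out)
    return out


def owns(tracker: tuple[int, int], secret_k: int) -> bool:
    """Only the holder of k can recognize (and prove) ownership."""
    a, b = tracker
    return pow(a, secret_k, P) == b


rng = random.Random(0)  # seeded for reproducibility; real code needs a CSPRNG
validator_secrets = [rng.randrange(2, P - 1) for _ in range(8)]
initial_pool = [make_tracker(k, rng) for k in validator_secrets]

pool = initial_pool
for _ in range(5):  # several proposers each shuffle-and-reblind
    pool = reblind_and_shuffle(pool, rng)

selected = pool[rng.randrange(len(pool))]  # the slot's random blinded ID
claimants = [i for i, k in enumerate(validator_secrets) if owns(selected, k)]
print("validators able to claim the slot:", claimants)
```

Exactly one validator can claim the selected tracker, while an observer who only sees the reshuffled pool has nothing that ties that tracker back to a validator's original entry.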

Whisk SSLE protocol

What are some links to existing research?

Paper by Dan Boneh (2020): https://eprint.iacr.org/2020/025.pdf
Whisk (concrete proposal for Ethereum, 2022): https://ethresear.ch/t/whisk-a-practical-shuffle-based-ssle-protocol-for-ethereum/11763
Single secret leader election tag on ethresear.ch: https://ethresear.ch/tag/single-secret-leader-election
Simplified SSLE using ring signatures: https://ethresear.ch/t/simplified-ssle/12315

What is left to do, and what are the tradeoffs?

Realistically, what's left is finding and implementing a protocol
that is sufficiently simple that we are comfortable implementing it on
mainnet. We highly value Ethereum being a reasonably simple protocol,
and we do not want complexity to increase further. SSLE implementations
that we've seen add hundreds of lines of spec code, and introduce new
assumptions in complicated cryptography. Figuring out an
efficient-enough quantum-resistant SSLE implementation is also an open
problem.

It may end up the case that the extra complexity introduced by SSLE
only goes down enough once we take the plunge and introduce the
machinery to do general-purpose zero-knowledge proofs into the Ethereum
protocol at L1 for other reasons (eg. state trees, ZK-EVM).

An alternative option is to simply not bother with SSLE, and use
out-of-protocol mitigations (eg. at the p2p layer) to solve the DoS
issues.

How does it interact with other parts of the roadmap?

If we add an attester-proposer separation (APS) mechanism, eg. execution
tickets, then execution blocks (ie. blocks containing Ethereum
transactions) will not need SSLE, because we could rely on block
builders being specialized. However, we would still benefit from SSLE
for consensus blocks (ie. blocks containing protocol messages such as
attestations, perhaps pieces of inclusion lists, etc).

Faster transaction confirmations

What problem are we solving?

There is value in Ethereum's transaction
confirmation time decreasing further, from 12 seconds down to eg. 4
seconds. Doing this would significantly improve the user experience of
both the L1 and based rollups, while making defi protocols more
efficient. It would also make it easier for L2s to decentralize, because
it would allow a large class of L2 applications to work on based
rollups, reducing the demand for L2s to build their own
committee-based decentralized sequencing.

What is it and how does it work?

There are broadly two families of techniques here:

- Reduce slot times, down to eg. 8 seconds or 4 seconds. This does not necessarily have to mean 4-second finality: finality inherently takes three rounds of communication, and so we can make each round of communication be a separate block, which would after 4 seconds get at least a preliminary confirmation.
- Allow proposers to publish pre-confirmations over the course of a slot. In the extreme, a proposer could include transactions that they see into their block in real time, and immediately publish a pre-confirmation message for each transaction ("My first transaction is 0x1234...", "My second transaction is 0x5678..."). The case of a proposer publishing two conflicting confirmations can be dealt with in two ways: (i) by slashing the proposer, or (ii) by using attesters to vote on which one came earlier.
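As a rough sketch of what option (i) would need to detect, the following finds pairs of conflicting pre-confirmations. The message fields and function name are hypothetical, signature checking is omitted entirely, and the research posts linked below define their own formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreConfirmation:
    """Hypothetical message: 'transaction number `index` in my block
    for `slot` is `tx_hash`'. A real message would also be signed."""
    proposer: str
    slot: int
    index: int
    tx_hash: str


def find_equivocations(messages):
    """Return pairs of messages in which the same proposer committed the
    same position of the same slot to two different transactions."""
    first_seen = {}
    conflicts = []
    for msg in messages:
        key = (msg.proposer, msg.slot, msg.index)
        earlier = first_seen.setdefault(key, msg)
        if earlier.tx_hash != msg.tx_hash:
            conflicts.append((earlier, msg))
    return conflicts


messages = [
    PreConfirmation("proposer_a", slot=100, index=0, tx_hash="0x1234"),
    PreConfirmation("proposer_a", slot=100, index=1, tx_hash="0x5678"),
    PreConfirmation("proposer_a", slot=100, index=1, tx_hash="0x9abc"),
]
for earlier, later in find_equivocations(messages):
    print(f"conflict at slot {earlier.slot}, index {earlier.index}: "
          f"{earlier.tx_hash} vs {later.tx_hash}")
```

A conflicting pair like this is self-contained evidence, which is what makes slashing possible; option (ii) is needed when the protocol wants to decide which of the two messages should stand rather than only punish the proposer.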

What are some links to existing research?

Based preconfirmations: https://ethresear.ch/t/based-preconfirmations/17353
Protocol-enforced proposer commitments (PEPC): https://ethresear.ch/t/unbundling-pbs-towards-protocol-enforced-proposer-commitments-pepc/13879
Staggered periods across parallel chains (a 2018-era idea for
achieving low latency): https://ethresear.ch/t/staggered-periods/1793

What is left to do, and what are the tradeoffs?

It's far from clear just how practical it is to reduce slot times.
Even today, stakers in many regions of the world have a hard time
getting attestations included fast enough. Attempting 4-second slot
times runs the risk of centralizing the validator set, and making it
impractical to be a validator outside of a few privileged geographies
due to latency. Specifically, moving to 4-second slot times would
require reducing the bound on network latency ("delta") to two seconds.

The proposer preconfirmation approach has the weakness that it can
greatly improve average-case inclusion times, but not
worst-case: if the current proposer is well-functioning, your
transaction will be pre-confirmed in 0.5 seconds instead of being
included in (on average) 6 seconds, but if the current proposer is
offline or not well-functioning, you would still have to wait up to a
full 12 seconds for the next slot to start and provide a new
proposer.

Additionally, there is the open question of how
pre-confirmations will be incentivized. Proposers have an
incentive to maximize their optionality as long as possible. If
attesters sign off on timeliness of pre-confirmations, then transaction
senders could make a portion of the fee conditional on an immediate
pre-confirmation, but this would put an extra burden on attesters, and
potentially make it more difficult for attesters to continue functioning
as a neutral "dumb pipe".

On the other hand, if we do not attempt this and keep
finality times at 12 seconds (or longer), the ecosystem will put greater
weight on pre-confirmation mechanisms made by layer 2s, and
cross-layer-2 interaction will take longer.

How does it interact with other parts of the roadmap?

Proposer-based preconfirmations realistically depend on an
attester-proposer separation (APS) mechanism, eg. execution
tickets. Otherwise, the pressure to provide real-time
preconfirmations may be too centralizing for regular validators.

Exactly how short slot times can be also depends on the slot
structure, which depends heavily on what versions of APS, inclusion
lists, etc we end up implementing. There are slot structures that
contain fewer rounds and are thus more friendly to short slot times, but
they make tradeoffs in other places.

Other research areas

51% attack recovery

There is often an assumption that if a 51% attack happens (including
attacks that are not cryptographically provable, such as censorship),
the community will come together to implement a minority
soft fork that ensures that the good guys win, and the bad guys get
inactivity-leaked or slashed. However, this degree of over-reliance on
the social layer is arguably unhealthy. We can try to reduce reliance on
the social layer, by making the process of recovering as automated as possible.

Full automation is impossible, because if it were, that would count
as a >50% fault tolerant consensus algorithm, and we already know the
(very restrictive) mathematically
provable limitations of those kinds of algorithms. But we
can achieve partial automation: for example, a client could
automatically refuse to accept a chain as finalized, or even as the head
of the fork choice, if it censors transactions that the client has seen
for long enough. A key goal would be ensuring that the bad guys in an
attack at least cannot get a quick clean victory.
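A minimal sketch of the client-side rule described above. The function names and the 32-slot patience value are hypothetical, and a real rule would also have to check that the omitted transactions were valid, paid enough fees, and could have fit into the blocks.

```python
def overdue_transactions(chain_txs: set[str], first_seen_slot: dict[str, int],
                         current_slot: int, patience_slots: int) -> list[str]:
    """Transactions this client has seen for at least `patience_slots`
    slots that the candidate chain still has not included."""
    return sorted(
        tx for tx, seen in first_seen_slot.items()
        if current_slot - seen >= patience_slots and tx not in chain_txs
    )


def accept_as_head(chain_txs: set[str], first_seen_slot: dict[str, int],
                   current_slot: int, patience_slots: int = 32) -> bool:
    """Refuse a chain that appears to be censoring. The default of 32
    slots is an arbitrary placeholder, not a proposed parameter."""
    return not overdue_transactions(chain_txs, first_seen_slot,
                                    current_slot, patience_slots)


mempool_view = {"0xaa": 100, "0xbb": 100, "0xcc": 190}  # tx -> slot first seen
honest_chain = {"0xaa", "0xbb"}
censoring_chain = {"0xaa"}

print(accept_as_head(honest_chain, mempool_view, current_slot=200))
print(accept_as_head(censoring_chain, mempool_view, current_slot=200))
```

Because each client applies the rule to its own view of the mempool, different clients may disagree for a while; the aim is only to deny the attacker a quick clean victory, not to replace the fork choice.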

Increasing the quorum threshold

Today, a block finalizes if 67% of stakers support it. There is an
argument that this is overly aggressive. There has been only one (very
brief) finality failure in all of Ethereum's history. If this percentage
is increased, eg. to 80%, then the added number of non-finality periods
will be relatively low, but Ethereum would gain security properties: in
particular, many more contentious situations will result in
temporary stopping of finality. This seems a much healthier
situation than "the wrong side" getting an instant victory, both when
the wrong side is an attacker, and when it's a client that has a
bug.

This also gives an answer to the question "what is the point of solo
stakers"? Today, most stakers are already staking through pools, and it
seems very unlikely to get solo stakers up to 51% of staked
ETH. However, getting solo stakers up to a quorum-blocking minority, especially if the quorum is 80% (so a quorum-blocking
minority would only need 21%) seems potentially achievable if we work
hard at it. As long as solo stakers do not go along with a 51% attack
(whether finality-reversion or censorship), such an attack would not get
a "clean victory", and solo stakers would be motivated to help organize
a minority soft fork.
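The arithmetic behind these thresholds can be sketched as follows. The helper names are made up for illustration; the overlap bound is the standard observation that two quorums of size q must share at least 2q - 1 of the stake.

```python
from fractions import Fraction


def blocking_minority(quorum: Fraction) -> Fraction:
    """Finality stalls once MORE than this share of stake withholds
    its votes, since the rest can no longer reach the quorum."""
    return 1 - quorum


def min_double_signers(quorum: Fraction) -> Fraction:
    """If two conflicting blocks both reach the quorum, at least this
    share of stake must have signed both."""
    return max(2 * quorum - 1, Fraction(0))


for quorum in (Fraction(2, 3), Fraction(4, 5)):
    print(f"quorum {quorum}: "
          f"blocking minority > {blocking_minority(quorum)}, "
          f"double-signers >= {min_double_signers(quorum)}")
```

Raising the quorum from 2/3 to 4/5 lowers the share needed to halt finality from just over 1/3 to just over 1/5 (the 21% figure above), and raises the share that must double-sign to finalize two conflicting blocks from 1/3 to 3/5.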

Note that there are interactions between quorum thresholds and the
Orbit mechanism: if we end up using Orbit, then what exactly "21% of
stakers" means will become a more complicated question, and will depend
in part on the distribution of validators.

Quantum-resistance

Metaculus currently believes, though with wide error bars, that quantum
computers will likely start breaking cryptography some time in the
2030s.

Quantum computing experts such as Scott Aaronson have also recently
started taking the possibility of quantum computers actually working in
the medium term much more
seriously. This has consequences across the entire Ethereum roadmap:
it means that each piece of the Ethereum protocol that currently depends
on elliptic curves will need to have some hash-based or otherwise
quantum-resistant replacement. This particularly means that we cannot
assume that we will be able to lean on the
excellent properties of BLS aggregation to process signatures from a
large validator set forever. This justifies conservatism in the
assumptions around performance of proof-of-stake designs, and also is a
cause to be more proactive to develop quantum-resistant
alternatives.
