It started in 2018 at Devcon4 in Prague. The group of Ethereum core developers gathered and discussed a troubling topic – the problem of ever growing state. A symptomatic manifestation of this problem was the effort required by go-ethereum nodes to perform initial downloading of the state for the new peer joining the network. Back then, it was based on fast sync. It used the fact that Ethereum’s state commitment scheme is a form of a Merkle tree (though not binary, but with 16 children at each branch). Starting from the root hash, which for each block is present in a block header, the new peer queries the root node by its hash (there were such request in devp2p back then). Root node contains hashes of its children. The new peer then makes similar queries for each of the children, and the process goes on recursively from the root of the tree to the leaves, eventually downloading the entire state by using this structure. It ends up downloading all the intermediate hashes of the subtrees – so it is not very efficient compared to what happens these days, which is snap sync in go-ethereum and others (which downloads bunch of leaves at a time instead of one tree node), or Ottersync in Erigon, which is BitTorrent style downloading of the state file overlays (similar to Solana approach). Lets go back to fast sync. The peers that answer those request for merkle tree nodes, continue to move on along the chain. Even though they kept updating the current state every 12 or 14 seconds, they retained some history of the merkle tree of what the state was. This retention of some history was what allowed the new peers to perform fast sync download. However, as the size of the state gradually grew, it became more expensive to retain history which was too deep. Pruning of the history became ever more aggressive. Incidentally, the time required to download the state also grew. You may see now where the problem lies. Eventually we get the situation more and more often, when the new peer cannot proceed with the fast sync because all other peers pruned some bits of the Merkle tree it was downloading. The process fails. But not all is lost. If the new peer keeps what was downloaded, and restarts from the freshly taken root hash, it progresses faster, as it already has some parts of the tree from the failed attempt. I suppose eventually it will succeed. But still, the situation was quite unsettling.
It was decided to attempt to fix the situation by initiating State Rent research, and I volunteered to lead it.
The main idea behind state rent is that in order to restrict the growth of the state or perhaps even to reduce it, one introduces a sort of standing charge for all the items in the state. This charge is deducted on a regular basis. Whenever an item is exhausted its balance due to these charges, it gets automatically expelled from the state. My task was to research potential ways to implement this idea in Ethereum. Over a period of few months I have produced three proposals, iterating on each other, and incorporating the feedback received from the previous ones.
Lets us go back a little and discuss what is the state in general terms and why most of the blockchain systems since Ethereum have it. State is present implicitly in all blockchains. However, most notably in Bitcoin, its physical representation is not usually explicitly defined. What is defined are the operations that blockchain has to be able to perform efficiently on the implied state. For example, in Bitcoin, one needs to be able to quickly find unspent outputs, mark outputs as spent, and insert new outputs. Therefore, to the best of my knowledge, Bitcoin peers cannot download state from one another in a standard way, they can only download blocks (with transactions in them) and replay them to calculate the state. Of course, in reality I am sure there are snapshots of Bitcoin core database directory, which is synced up to certain blocks. Maintenance of such snapshots are outside of the Bitcoin protocol, and most probably outside of the official Bitcoin implementation. State in Ethereum is defined more explicitly. Apart from the requirements to efficiently find accounts by their addresses, and storage locations within the accounts, and modify these things, Ethereum also requires the efficient computation of commitment, from the logical structure of the state. This still does not fully specifies, but restricts possible physical representations of state. For example, one may chose to store the state as a merkle tree with accounts and storage items in the leaves. Or one may chose to store accounts and storage items separately from the structure of the Merkle tree. If we include devp2p networking specification, it does require efficient retrieval of either nodes of the merkle tree (as in fast sync), or subtrees of the merkle tree (as in snap sync).
If the downloading of the state by the new peer becomes an essential operation, this puts more constrains on the design of the state representation. This was one of the main reason of transition from Erigon 2, which did not support such downloading and had to replay the entire blockchain, to Erigon 3, which supports efficient state downloading. In such designs, anything else that is required to resume the following of the blockchain and other operations, should be thought as a part of the state. For example, in the early versions of Cosmos Hub, the commitment was calculated using self-balancing Merkle trees. Unlike radix tree, used in Ethereum, self-balancing tree does not possess the property of unique representation. This means that if one is given just the set of accounts making up the tree, it is not possible to uniquely determine the self-balancing merkle tree for it, there are many possible way of building such a tree from the same set of leaves. In this cases, some kind of description of which specific shape of the self-balancing tree the system ended up forming, must be considered part of the state. If Cosmos Hub, having such commitment scheme, were to develop state downloading mechanism, it would require not just downloading list of accounts, but also information describing in which specific way these accounts need to be organised into a self-balancing merkle tree. It is not a show stopper for downloading mechanism, but needs extra consideration. In fact, it is my opinion that in the long run, self-balancing trees are better than radix trees for state commitments.
State rent for accounts with just balances and no storage is relatively simple to design. There are two main questions to answers. Firstly, how much the rent per account needs to be at any given time. And secondly, how does the mechanism of subtracting the rent (and burning it) should be. In my first proposal I wrote that the rent per account should be a function of total size of the state. For example, no rent when the state is really small up to certain size, and after that, the larger is the state, the larger is also the rent. As for the mechanism, I proposed to add an extra structure to the state (that it why we have just had a discussion on how the state is defined), which is the priority queue of references to the accounts, where the lower is the remaining balance, the higher is the priority. Whenever balances of accounts change, their position in the priority queue also changes, and at every block there is a system operation to pop some accounts from the front of the queue, in order to expel them. This approach allows not to subtract the rent explicitly for every account regularly, which would be a lot of changes.
However, in Ethereum, accounts may also have variable number of storage items associated with them. And those storage items do not hold the balance of their own. Also, even though they are modifiable by the actions of the owning account only, it is a convention that they often represent the logical ownership for some other accounts. This logical ownership is not explicitly defined and in general hard or impossible to compute. And finally, the programming model as a convention, and the Solidity (and I am sure Vyper) compilers assume that storage items cannot just disappear. If they do, a lot of smart contracts that were written, will fail in very weird ways. After three rounds of proposals and feedback, I came to the conclusion that the only way to get this done is to change the programming model as a convention, and the compilers. Changing programming model as a convention would mean going through most used smart contracts written, re-writing them with the new convention, which is rent-safe. It would involve mostly social work, and a bit of the technical work, over prolonged period, without any chance of success. After realising it, I decided to stop the work on the state rent for Ethereum.
After that came a pivot towards Stateless Ethereum. The main idea is that blocks have to include, apart from sender and receiver addresses, and the input data for each transaction, the merkle proof for all the items in the state, that transactions in these blocks are going to access. This is because Stateless Ethereum assumes that the verification of such blocks should proceed without any knowledge of the state except for the state commitment, or root hash of the merkle tree. How does this help? First, let’s see what kinds of peers will be able to benefit from that, and then we shall see what is the cost. Assume this categorisation of Ethereum peers:
1. Block producers
2. Transaction creators
3. Transaction relays
4. RPC servers
5. Others
Stateless Ethereum does not benefit block producers, because those would be the peers that still need to access the full state to execute transaction, and generate merkle proofs to go into blocks. It does not benefit RPC servers either, assuming that the most common RPC request is `eth_call`, which requires unconstrained state access. Transaction creators do not need access to the state, unless they need to perform some simulation to calculate inputs. Transaction relays, although not requiring the access to the entire state (but they might do in the case of full Account Abstraction), still need actual account balances and nonces to validate transaction before relaying, so in general they also do not benefit. Which leaves us with other peers, who may potentially benefit when validating the stateless blocks. Given all that, the benefits are not very clear.
The main cost of Stateless Ethereum comes with the additional data in form of merkle proof that need to be attaches to every block. Worst case scenarios are very scary, average cases are perhaps bearable. There was an observation that if the merkle tree used for commitment was binary rather than with radix 16, the merkle proofs would be smaller. Although the depth of the tree would be roughly 4 times bigger, making merkle proofs having roughly 4 times more elements, each element would contain 1 sibling hash instead of 15. Therefore, the size of the merkle proofs should reduce roughly by the factor of 15/4 = 3.75. Further development of this theme lead to the work on Verkle tree commitments, which offer significant reduction of the proof sizes. As far as I understand, the research is rolling back to the binary merkle trees.
There are two other technical challenges to overcome. Extra gas needs to be charged for the merkle proof, since it makes every block larger. But there is one proof per blocks, whereas each transaction pays its transaction fees separately, and often at different rate of ETH per gas. Secondly, merkle proof of bytecode of a contract would require including the entire bytecode into the proof. We will not dive into possible solutions, because on balance it seems that cost/benefit ratio for stateless ethereum simply does not work out.
Improving on the pure Stateless Ethereum design, I have proposed a middle ground design, called Regenesis. In order to explain why this is a middle ground, we need to introduce the notion of implicit and explicit state. We will call implicit state the state that we have right now, and it grows, changes, and shrinks in exactly the same way as the state does today. Explicit state, at the time of the first regenesis event is empty. Unlike in Stateless Ethereum, now transactions, not blocks carry merkle proofs. Such merkle proof need to include only the state items that a transactions accesses but that are not yet part of the explicit state. If a transaction is included into a block, the items than it proved become part of the explicit state, regardless of whether the execution of this transaction succeeded or failed. This is how the explicit state grows. If a transaction only accesses items that are already in the explicit state, no proofs are needed. Now you can hopefully see why regenesis is a middle ground between stateful and stateless ethereum. Since it is not possible in general to predict what a transaction may access, a situation may arise when a transaction tries to access items that are neither in the proof nor in the explicit state. In this case, the transaction fails, but still pays for its gas, and all items it proves are added to the explicit state. This means that, if resubmitted with updated proof, it will eventually succeed. Renegesis events happen regularly, and after each event the explicit state is emptied again. Let’s analyse regenesis in the same way we analysed stateless Ethereum. For benefits, we look at the same categories of peers. Block producers benefit because they only need to maintain the explicit state, transactions come with the proofs of anything which is outside of the explicit state. Transaction creators do not benefit, they actually bear most of the burden of regenesis. Now they need to have access to potentially the entire implicit state to produce merkle proofs for the transactions. However, in many cases such access is not time critical, and can afford extra means for retrieval, for example, Portal Network, which we will touch on later. Transaction relays would in general need implicit state to re-verify the balance of the account sending the transaction, but this check can be relaxed and transaction relay simply utilise the balance and the nonce either from the proof or from the explicit state. So transaction relays also benefit, at the cost of relaying larger transactions, because they include merkle proofs. RPC servers with `eth_call` requests in general do not benefit, because they still require access to the entire implicit state to simulate potential transactions. Other peers would benefit in a similar way as in the case of stateless Ethereum. In comparison with stateless Ethereum, block proposers are winners, and transaction creators are the losers. Regenesis represents this distribution of burden. As mentioned before, transactions creators would benefit from joining the Portal Network to share this burden amongst themselves. The technical challenge of charging gas for proofs gets easier, since each transaction can pay for its own proof. Another challenge of proving the byte code still exists, and therefore, the change of the state commitment scheme is required one way or the other.
Briefly about Portal Network project. Portal Network nodes form a few different subnets, each having a function of distributing block headers, block bodies, pieces of state, receipts, state history, and so on. Here we are interested in the state subnet. Portal forms a content-addressable distribution of data, and its peers can ask and be asked to retrieve a particular merkle tree node, account, storage item, or bytecode by specifying their tree paths in the merkle tree. All responses are furnished with corresponding proofs, thus minimising trust assumptions. Each peer does not need to store all the data items, they are spread out using kademlia structure. Queries and responses are transmitted via UDP rather than TCP, making the network much more transient and flexible than devp2p or libp2p networks. The stated use case for the Portal Network is lightweight clients. And indeed, in our analysis, transaction creators would benefit from such a network. It becomes a “saving grace” of Regenesis for them. Other potential users of Portal network would be L2 rollups, both for verification of data on L1 and posting it there. Similar can be said about the operators of trustless bridges.
In 2020, I ended my involved in the stateless Ethereum research group, once it became clear to me that Regenesis is replaced by the new proposal, called State Expiry, which however great, I did not believe at the time would be adopted. Back then, my skepticism was based on the observation that state expiry was becoming much more complex scheme than regenesis. Regenesis was perhaps already too complex to ever be adopted, therefore, I thought, state expiry had no chance at all. When I look at it now, there are two more reasons that is added to my scepticism. Now that the details of State Expiry are more less clear, one can see that like State Rent, it would break the conventional programming model for smart contract, and can only be rolled out with the change of the said programming model. And, looking at the issue more broadly, it aspires to never remove things forever, always leaving mechanisms for resurrection, the desire which made State Rent proposals almost intractable. I now call this the “timeless treasure fallacy” – the belief that blockchain should give you a way to preserve your treasure forever, without you having to do anything about it. This facility never existed in the real world.
Where should we go from here? I think if Ethereum, for whatever reason, want to reclaim primacy as a smart contract platform, and compete with Solana, instead of becoming just a rollup aggregator, it needs to develop a strategy (it is not actually very difficult to do, I believe) to perform two fundamental changes. Number one – fix incomprehensible state problem. This would require changing the organisation of the state. Most probably to adopt a similar model to Solana. That means that in EVM, instead of using `SLOAD` and `SSTORE` instructions to interact with the state that looks like a hash map, one would perform memory-mapping from account’s storage into EVM memory. This memory mapping would be hard to inject into the current incarnation of EVM, but it would be much easier with EOF. This is because special data sections can be assigned to the memory that is mapped to the linearised storage.
Number two is fixing state growth. Introducing State Rent, or State Expiry are both infeasible from my point of view, without relaunching the chain. If the chain is to be relaunched, one has to make a high-level decision whether “timeless treasure” does have a place in it or not, and design accordingly. Somewhat more realistic path is to do Regensis plus Portal Network, via change of commitment scheme, for example, using the design of Verkle trees, which does solve the issue of proving the bytecode. However, I will say it again, without strategy, and the accompanying narrative, if you will, these efforts are very unlikely to succeed.
Beyond the question what Ethereum will do, there is a topic of hybrid public/private state, which I will not discuss today, and leave for some other time. Our colleague Mark has recently made an observation that thinking about explicit vs implicit state leads to some insights about how a hybrid state may be organised, and continuing this thought, we are arriving to the generalisation of the entire ensemble of L1 with the L2s, but realised in the single system.
Original video on the Akexey Akunov Monoblunt channel on Telegram: https://t.me/monoblunt/128
Original notes on the Akexey Akunov Monoblunt channel on Telegram: https://t.me/monoblunt/129