Maybe light at the end of the tunnel.
Hello everyone, it is Giulio again, and this is another one of my blog posts. Today I am going to talk about the Erigon-CL project endgame, the current state of the art, and how far I am from having a minimum working product. I will also discuss the issues that are arising with the Erigon-CL lightclient and how soon it may (possibly) be replaced by a minimal full client.
Erigon-CL “Endgame” Structure.
First of all, I finally came up with a possible final application layout for Erigon-CL, which in a nutshell is a hybrid staged sync. For people who have never run an Erigon node, staged sync is the methodology by which Erigon syncs up: it processes the different actions the node needs to perform in sequence rather than at the same time. For example, downloading blocks and executing them are separate tasks. This is useful in the EL counterpart because it allows “localized” improvements to individual stages without having to worry about breaking other components. In the CL this is also a viable architecture and can possibly yield the same benefits; however, the CL does far less indexing and data transformation than the EL does. On top of this, the Ethereum 2.0 algorithm has quite a contorted and complex fork choice rule to handle reorgs and side forks, which makes staged sync possibly the worst way to stay in sync, since the protocol itself implies that everything is done together. Some parts can still be divided into stages, just not in the same way as in Erigon EL.
Above is the “Endgame” structure. To explain it simply: we now have a one-way staged sync which retrieves and computes historical data, and historical data ONLY. It is “one-way” in the sense that once it is done, it is never touched again. After that, we start the proper full client service, which applies fork choice as per the specifications. Very interestingly, the “Engine API” is now not on the JSON-RPC of the EL but on the gossip of the CL. This architecture is subject to change, but it seems quite clean as-is, and it is the cleanest implementation methodology I can envision as of now.
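To make the two phases a bit more concrete, here is a minimal Python sketch of the idea; it is not Erigon-CL code. Every name in it (HybridNode, download_history, process_history, on_gossip_block) is hypothetical, the assumption that historical data ends at a checkpoint slot is mine, and the "highest slot wins" rule is only a stand-in for the real fork choice, which is far more involved.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical and exist only for illustration.
State = Dict[str, int]


def download_history(state: State) -> None:
    # Pretend to fetch every historical block up to the checkpoint slot.
    state["downloaded"] = state["checkpoint"]


def process_history(state: State) -> None:
    # Pretend to compute historical data from the downloaded blocks.
    state["processed"] = state["downloaded"]


@dataclass
class HybridNode:
    state: State
    stages: List[Callable[[State], None]] = field(
        default_factory=lambda: [download_history, process_history]
    )
    historical_done: bool = False
    head: int = 0

    def run_historical_sync(self) -> None:
        # One-way: once finished, the stages are never run again.
        if self.historical_done:
            return
        for stage in self.stages:  # stages run in sequence, one at a time
            stage(self.state)
        self.head = self.state["checkpoint"]
        self.historical_done = True

    def on_gossip_block(self, slot: int) -> None:
        # Live service: only starts after the historical part is done.
        if not self.historical_done:
            raise RuntimeError("historical sync must finish first")
        # Placeholder rule, NOT the real fork choice: keep the highest slot.
        if slot > self.head:
            self.head = slot


node = HybridNode(state={"checkpoint": 100, "downloaded": 0, "processed": 0})
node.run_historical_sync()
node.on_gossip_block(102)
node.on_gossip_block(101)
print(node.head, node.state["processed"])  # prints "102 100"
```

The point of the split is that the sequential stages only ever touch old data, while everything that depends on fork choice lives in the service that reacts to incoming blocks.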
Erigon-CL: State of the art.
Now to the progress itself. Erigon-CL can stay in sync with the beacon chain just okay, but it cannot handle reorgs and bad blocks. It can already replace the EL’s P2P protocol, as there is a working prototype of that, although it requires some polishing. We now pass all consensus-tests on Altair and Bellatrix except for transition/fork-choice and ssz_static, because I have not implemented proper handlers for them. As for other forks like Capella and Phase0, I have not implemented them yet: Capella out of laziness, and Phase0 because it has a tricky beacon state structure.
Consensus Layer Lightclient: replacement and issues.
Somewhat soonish, we may be forced to remove the lightclient in favor of a minimal full client, because we observed that there do not seem to be enough Nimbus nodes for all the Erigon-CL lightclient nodes. The situation is especially problematic on testnets such as Goerli and Sepolia, and a bit more tame on mainnet. Regardless, I found that on Ethereum mainnet the majority of a lightclient node’s peers are other lightclient nodes, which end up pretty much just passing around messages from the lucky lightclient that is connected to a Nimbus node. So the lightclient server situation is not bad on mainnet for now, but it is getting worse as there are not enough Nimbus nodes around. Since I am so close to having a working minimal version of the full client, why not just replace the lightclient altogether?
Above is how much needs to be built in order to provide a minimum working full client: pretty much everything except historical data, with the full client service simply starting from checkpoint sync. In practice, nothing will change for the end user, except perhaps for some better security assumptions.
This is all.
original post: https://giulioswamp.substack.com/p/04032023-erigon-cl-endgame-structure