[updated by author on 15/05/2023]
The ultimate motivation behind public blockchains lies in the concept of decentralisation. This concept refers to empowering individuals by distributing control such that no single entity can exert control over the majority. If consensus can be achieved through a mechanism that ensures no invalid change happens on the ledger, then decentralisation is possible at any scale, including globally. When interacting with public blockchains at this scale, it becomes a right for any individual to verify what is happening on-chain without requiring trust. In Ethereum, there are two primary ways to interact with the chain: running a full node or relying on a third-party service such as Infura or Alchemy. For the average user, neither is ideal, as the former requires a considerable amount of hardware and storage, and the latter involves trust in the service. A resource-friendly and trustless alternative is to run a light client. This article will explain in depth what light clients are and how they benefit the Ethereum ecosystem.
Note - Before reading about light clients, it will be helpful to understand what proof-of-work and proof-of-stake are.
Contents:
FAQ
Introduction
Sync Committee
Initialisation
Syncing
Tracking Header
Benefits for the Ethereum ecosystem
Conclusion
FAQ
What are light clients?
Light clients are proposed so that nodes with limited computational resources can still participate in the Ethereum network as light nodes. Currently, clients exist for running full nodes, where a full node keeps track of a subset of the history of the entire blockchain and propagates transactions and blocks. Light nodes instead rely on the information that full nodes hold (without having to store or process the same information themselves) to keep up to date with the state of the blockchain. A light node that holds a current and accurate view of the blockchain can interact with the chain, though it cannot take part in forming consensus. This pattern lets light nodes avoid the extreme memory and storage requirements that devices such as mobile phones and IoT devices cannot meet.
What are sync committees?
Consisting of 512 validators, a sync committee is responsible for signing the headers of recently attested blocks. The committee changes every 8,192 slots, a span known as a sync period, so that no group of validators accrues too much authority, and each new committee is determined by a cryptographic random seed. The sync committee signature saved in the fields of each block is a concise and important means by which other nodes, especially light nodes, can determine information such as the currently ‘correct’ chain and how many blocks have elapsed. A node needs a separate sync for each sync period that has elapsed since the last checkpointed block it has access to.
Introduction
In short, a light client keeps track of certain roots, belonging to a block header. This allows it to trustlessly verify whether a piece of data belongs to a larger set of data. For example, the data might be a transaction hash, and we can check if that hash is included in a block. You might be wondering: How can the light client check whether the roots are authentic? This article will attempt to explain what the light client relies on to determine the authenticity of the roots.
Today, Ethereum uses the proof-of-stake mechanism to secure the network in an accountable manner. Light clients rely on a group of validators, called a sync committee, to determine the authenticity of the roots. Before Ethereum’s transition from proof-of-work, the developers wanted the new PoS chain (also known as the Beacon Chain) to be friendly to light clients. With the introduction of sync committees in the Altair hard fork, light clients can easily track the roots in block headers and remain synced as new blocks are added. Let’s take a deep dive into what a sync committee is and why it is needed.
Sync Committee
A sync committee, made up of 512 validators, is responsible for signing the headers of newly attested blocks. It changes every period that lasts roughly 27 hours (27 hours, 18 minutes, and 24 seconds to be exact), equivalent to 8,192 slots where each slot is 12 seconds long. The selection for the next sync committee is based on a random seed to minimise the probability of having a corrupted set of validators. Every new block contains a sync committee signature and bitfield, saved directly into its fields. Using this information, the light client can keep track of the latest header in an environment limited by resources. These environments include browser extensions, mobile phones, IoT devices, and low-powered laptops.
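The length of a sync period follows directly from the two figures above. A minimal sketch of that arithmetic, where the function and constant names are illustrative only and not taken from any client:

```python
# Figures quoted in the text: 12-second slots, 8,192 slots per sync period.
SECONDS_PER_SLOT = 12
SLOTS_PER_SYNC_PERIOD = 8192


def sync_period_duration(slots=SLOTS_PER_SYNC_PERIOD, seconds_per_slot=SECONDS_PER_SLOT):
    """Return the sync period length as (hours, minutes, seconds)."""
    total_seconds = slots * seconds_per_slot
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


print(sync_period_duration())  # (27, 18, 24)
```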
The sync committee is needed because, without it, the light client would require far more computational resources. Running the proof-of-stake algorithm to verify the votes of all active validators would be a very expensive operation. The number of votes to verify would be considerably higher than the single aggregated signature created by the sync committee. By relying on sync committee participation, light clients drastically reduce the amount of data they need to process.
Initialisation
To sync to the latest block header, the light client needs to know the current sync committee. That means knowing the public keys of all validators that are part of that committee. This information, along with a proof, is requested from a full node by supplying a trusted checkpoint root. This checkpoint, also known as a weak subjectivity checkpoint, is the Merkle root of a block that is agreed subjectively to be part of the canonical chain. The proof is a Merkle branch that shows whether the sync committee provided by the node is part of the chain. After receiving the request, the node responds with a snapshot containing the information listed below. If the verification of the branch is successful, the light client stores the snapshot and fetches committee updates until it reaches the latest sync period.
Depiction of the transfer of knowledge that would occur between a light node and a full node when the former wishes to sync to the latest block header.
Header
Block’s header corresponding to the checkpoint root.
Current Sync Committee
Public keys and the aggregated public key of the current sync committee.
Current Sync Committee Branch
Proof of the current sync committee in the form of a Merkle branch.
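The branch check that the light client performs on the snapshot can be sketched as follows. This is a minimal illustration, assuming SHA-256 hashing and modelled on the shape of the `is_valid_merkle_branch` helper in the consensus specifications. The four-leaf tree is a toy stand-in: a real client would use the hash tree root of the sync committee as the leaf, with a depth and index fixed by the specification, none of which is reproduced here.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def is_valid_merkle_branch(leaf: bytes, branch: list, depth: int, index: int, root: bytes) -> bool:
    """Hash the leaf upwards with its sibling nodes and compare against the root."""
    value = leaf
    for i in range(depth):
        if (index >> i) & 1:
            # Current node is a right child: the sibling goes on the left.
            value = sha256(branch[i] + value)
        else:
            # Current node is a left child: the sibling goes on the right.
            value = sha256(value + branch[i])
    return value == root


# Toy tree with four leaves (hypothetical data, for illustration only).
leaves = [sha256(bytes([i])) for i in range(4)]
left = sha256(leaves[0] + leaves[1])
right = sha256(leaves[2] + leaves[3])
toy_root = sha256(left + right)

# Branch for the leaf at index 2: its sibling leaf, then the left subtree.
toy_branch = [leaves[3], left]
print(is_valid_merkle_branch(leaves[2], toy_branch, 2, 2, toy_root))  # True
```

Only the leaf, the branch, and the trusted root are needed, so the light client never has to hold the full data that the root commits to.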
Syncing
Depending on how recent the checkpoint root is, the light client needs to sync from the checkpoint’s sync period to the latest sync period. For example, if the provided checkpoint root is for an old block at slot 2,000 and the current block is at slot 18,384, the light client fetches updates for two sync periods. There is a difference of 16,384 slots between 2,000 and 18,384, and there are 8,192 slots in a sync period, so two sync periods have passed. Therefore, the sync committee has changed twice, and the light client learns of these changes by fetching two updates. An update is similar to a snapshot, but it contains information about the next sync committee and additional fields.
To reach the current block from the latest checkpoint root it knows, a light node needs one update for each sync period boundary at which the committee changes.
Attested Header
Header that will be accepted if the update is valid.
Next Sync Committee
Public keys and the aggregated public key of the next sync committee.
Next Sync Committee Branch
The Merkle branch of the next sync committee.
Finality Header
Header that was signed by the current sync committee.
Sync Committee Aggregate
Contains the sync committee’s bitfield and signature required for verifying the attested header.
Sync Committee Signature
An aggregate of the signatures from every validator that signed the attested header.
Fork Version
A four-byte value that is used for signing the attested header.
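The number of updates to fetch, as in the slot 2,000 to slot 18,384 example in this section, can be worked out from the slot numbers alone. A minimal sketch, in which the function names are illustrative assumptions and the count is taken as the number of sync period boundaries crossed:

```python
SLOTS_PER_SYNC_PERIOD = 8192  # figure quoted in the text


def sync_period(slot: int) -> int:
    """Index of the sync period that a slot falls into."""
    return slot // SLOTS_PER_SYNC_PERIOD


def updates_needed(checkpoint_slot: int, current_slot: int) -> int:
    """Number of committee changes between the checkpoint and the current slot."""
    return sync_period(current_slot) - sync_period(checkpoint_slot)


print(updates_needed(2000, 18384))  # 2
```

A checkpoint in the same sync period as the current slot needs no committee update at all, which is why a recent checkpoint makes initial syncing faster.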
Tracking Header
After syncing to the latest sync period, the light client starts requesting a header update for every slot. Upon receiving an update, it verifies the header by authenticating the corresponding sync committee signature and checks the bitfield to ensure that enough participants have signed it. If this validation is successful, it stores the received header and awaits a new slot. From this point, the light client can trustlessly verify the upcoming block headers without running complex algorithms or requiring excessive storage. Asynchronously, it maintains a local clock for the current slot, epoch, and sync period, so it knows exactly when the sync committee changes and when to request a header update.
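Two of the steps above, the bitfield participation check and the local clock, can be sketched as follows. This is an illustration under stated assumptions, not a client implementation: the two-thirds threshold is an assumed default since the text only says "enough participants", `SLOTS_PER_EPOCH = 32` and the `genesis_time` parameter are assumptions not stated in this article, and the signature verification itself is omitted because it needs a BLS library.

```python
SECONDS_PER_SLOT = 12          # figure quoted in the text
SLOTS_PER_SYNC_PERIOD = 8192   # figure quoted in the text
SLOTS_PER_EPOCH = 32           # assumption: not stated in this article


def has_enough_participants(bits, numerator=2, denominator=3):
    """True if at least numerator/denominator of the committee bits are set.

    The 2/3 default is an assumed threshold for illustration.
    """
    return sum(bits) * denominator >= len(bits) * numerator


def local_clock(now: int, genesis_time: int):
    """Return (slot, epoch, sync_period) for a Unix timestamp.

    genesis_time is a hypothetical parameter: the chain's genesis timestamp.
    """
    slot = (now - genesis_time) // SECONDS_PER_SLOT
    return slot, slot // SLOTS_PER_EPOCH, slot // SLOTS_PER_SYNC_PERIOD


# A 512-member bitfield with 342 signers passes a 2/3 threshold.
bits = [True] * 342 + [False] * 170
print(has_enough_participants(bits))  # True

# 18,384 slots after a made-up genesis timestamp.
print(local_clock(1_000_000 + 18384 * 12, 1_000_000))  # (18384, 574, 2)
```

Keeping the clock separate from the network requests lets the client anticipate a committee change from the slot number alone, without asking a full node.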
Benefits for the Ethereum Ecosystem
Currently, there are two ways to interact with a smart contract deployed on Ethereum: running a full node or relying on third-party services that provide access to a full node. The full node is needed to broadcast the user’s transaction to the network so that it is included in a future block. Running a full node solely for interaction creates a difficult user experience because the node takes a long time to sync and requires terabytes of storage space for blockchain data. This situation has led to a growing dependence on third-party services, as they remove the heavy storage requirement from end-users. Nonetheless, using these services involves a trade-off: the convenience comes at the cost of privacy and minimised trust. Light clients are therefore an ideal fit, as they could remove the unnecessary costs and provide trustless access for interacting with Ethereum.
Conclusion
Today, multiple efforts are under way in parallel to build light clients for Ethereum. Protocol researchers are working on the specification to improve the security of sync committees. Another notable effort is the Portal Network, which will let light clients access historical data, send transactions, and more. Light clients currently provide the basic functionality of syncing from a trusted checkpoint to a recent checkpoint and tracking sync committee participation. This can be seen by interacting with Lodestar’s light client demo or running Helios.