Celer Network is the world’s first and only generalized state channel network to have launched. Since its alpha mainnet launch in July 2019, Celer Network has processed more than $1.7M transactions for 12,000 users around the globe.
Today, we are excited to announce the release of the full open-source protocol specification of Celer Network’s state channel. You can access the specification in the CelerCore doc. To support growing, large-scale production use, we follow layered and end-to-end design principles in both the design and the implementation. In this blog, we walk through these key design principles at a high level, using the newly released open-source specification. We look forward to the community’s feedback and questions!
A Layered and Decoupled Architecture
Celer Network is designed around a three-layer, decoupled architecture, as shown in the picture above.
The lowest layer consists of the state channel smart contracts, which are responsible for holding the funds of the state channel network, resolving conditional payments, and adjudicating state channel application disputes. State channel clients and full nodes interact with these on-chain smart contracts to open channels, deposit, withdraw, and settle channels with uncooperative counterparties.
In the middle is a generic payment network, also known as the CelerPay Network. This network consists of client nodes and off-chain service provider nodes (OSPs), i.e., full Celer Network nodes. The main difference between Celer’s generic payment network and simple payment networks like Raiden or Lightning is that it supports a multi-hop conditional payment protocol.
To explain what a conditional payment is, let’s look at an example. When Alice sends Bob a conditional payment of X ETH across multiple OSPs in Celer Network, the X ETH is not immediately available to Bob. Instead, it sits in a “committed” state with a conditional dependency that will be resolved later. The condition can be anything whose result can be verified and obtained on-chain: Oracle input, an NFT ownership change, a rollup chain state and, of course, the final outcome state of a state channel application. The final amount that Bob receives from Alice falls in the range between 0 and X and is determined by evaluating the outcome of the conditional dependency with an arbitrary function.
In other words, sending a conditional payment that depends on the outcome of a state channel application in layer-2 is equivalent to sending funds to a smart contract application in layer-1. Therefore, we call the process of sending a conditional payment the fund allocation process in state channel applications.
The third layer is the application layer, i.e., CelerApp. In the context of generalized state channels, a conditional payment often depends on the outcome of a state channel program. A state progression protocol between the participants of a state channel application is needed to start the application from an initial state and then progress it all the way to the final state, either through collaborative mechanisms or through adjudication in a dispute. In Celer Network, we provide frameworks that let application developers easily write smart contracts that run off-chain as generalized state channel applications, simply by implementing some standard interfaces in a turn-based state-machine smart contract. With these interfaces implemented, all state progression and state adjudication is handled automatically by the Celer Network software without any additional logic from the developer.
The design goals for this state channel network are:
- Low cost
With these goals at heart, we follow these concrete design principles to guide the entire protocol and system design:
Minimize on-chain footprint
On-chain transactions are expensive and slow, so the top priority is to keep state transitions off-chain and to resort to on-chain operations only when absolutely necessary (e.g., when two parties disagree on a state, or when depositing funds to or withdrawing funds from the channel).
Another important aspect is to minimize the cost of each on-chain operation through thorough smart contract optimization. Some common practices (ranked by importance) include avoiding contract deployment by users, minimizing the number of transactions per business-level operation, and minimizing on-chain storage per channel.
Minimize relay node on-chain interaction
One essential lesson we learned from building various large-scale, robust distributed systems is to push the complexity to the edge, which also applies to building a robust state channel network. The off-chain protocol should ensure that intermediate nodes rarely, if ever, need to perform on-chain transactions or view function calls when relaying off-chain payments, and should push the responsibility for on-chain disputes to the end-to-end payment sender/receiver and app users. This end-to-end principle makes the network much easier to scale and to defend against malicious attacks.
Minimize on-chain view calls
People may have the misconception that on-chain view calls are free and can be used at any time. In fact, on-chain view calls can quickly become a bottleneck if not used carefully when running an off-chain service in production, mainly for two reasons: 1) from a cost perspective, most entities have to rely on services like Infura or Alchemy to view on-chain information, and have to pay them in order to receive a high-throughput and reliable service; 2) from a performance perspective, on-chain view calls are often much slower than fetching data from local storage servers or exchanging information with state channel peers.
Minimize off-chain communication overhead
Though off-chain communication is much faster than on-chain interaction, it still needs to be optimized for better performance and reliability. There are two major overheads to be minimized for each business-level operation: the message round trip and the storage IO. The details of the off-chain protocols will be explained in greater depth in later chapters about CelerPay off-chain and CelerApp.
Enable low-cost decentralized system upgrade
One common issue with layer-1 blockchains is that they are very difficult to upgrade, since an upgrade requires global consensus. State channels have the advantage of allowing per-channel upgrades as long as all the channel peers agree. However, if not well designed, this upgrade process can still be cumbersome, involving multiple on-chain transactions and long off-chain downtime per channel. Celer enables zero-downtime and low-cost system upgrades through structured contract design and the use of protobuf, which will be detailed in later sections.
The Powerful Conditional Payment Primitive
Out of this decoupled architecture, we aim to extract a single low-level primitive data structure that is cleanly defined, highly composable, and can be implemented for high performance. After multiple design iterations and lessons learned from real-world use, we have converged on the low-level primitive of single-hop conditional payments and simplex channel states in the CelerPay Network. This primitive is very expressive: it can express familiar payment network concepts such as hash time locks and atomic swaps as types of conditions, and it can also support all kinds of flexible conditions such as state channel application states, Oracle output, and more.
Core Data Structure
The figure above shows the logical data model of a CelerPay state channel between Alice and Bob. Note that not all information needs to be stored on-chain. The model consists of two types of data about the payment channel peers:
- On-chain states shown in the solid boxes, mainly including peer addresses and amount of tokens that have been deposited and withdrawn. Both the on-chain smart contracts and the off-chain peer nodes maintain a full record of the on-chain states.
- Off-chain simplex states shown in the dashed boxes, mainly including the amount of tokens that have been transferred between peers, their pending pays, and the state sequence numbers. The peers pay each other off-chain by co-signing new simplex states. The contracts do not store the full simplex states; they only take them as function call inputs for critical on-chain operations.
CelerPay uses the full-duplex channel model for performance reasons. By decoupling the shared off-chain states into two unidirectional simplex states, two peers can pay each other concurrently. This significantly simplifies the off-chain protocol and improves the off-chain payment throughput.
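The data model above can be pictured with a small Python sketch. It is illustrative only and not the CelerCore schema: the class names, field names, and the `free_balance` formula are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class OnChainState:
    """Per-peer totals recorded by the contract (hypothetical field names)."""
    deposit: int = 0
    withdrawal: int = 0


@dataclass
class SimplexState:
    """One payment direction, co-signed off-chain (hypothetical field names)."""
    seq_num: int = 0
    transferred: int = 0  # total already paid to the other peer
    pending_pays: Dict[str, int] = field(default_factory=dict)  # pay id -> max amount

    def pending_total(self) -> int:
        return sum(self.pending_pays.values())


def free_balance(own: OnChainState, outgoing: SimplexState, incoming: SimplexState) -> int:
    """Assumed formula for what a peer can still commit to new payments."""
    return (own.deposit - own.withdrawal
            - outgoing.transferred - outgoing.pending_total()
            + incoming.transferred)


# Two independent simplex states form one duplex channel.
alice_chain = OnChainState(deposit=100)
bob_chain = OnChainState(deposit=50)
alice_to_bob = SimplexState(seq_num=3, transferred=20, pending_pays={"pay-1": 10})
bob_to_alice = SimplexState(seq_num=1, transferred=5)

alice_free = free_balance(alice_chain, alice_to_bob, bob_to_alice)
bob_free = free_balance(bob_chain, bob_to_alice, alice_to_bob)
print(alice_free, bob_free)
```

Because each direction has its own sequence number, Alice can update `alice_to_bob` while Bob updates `bob_to_alice` without either update invalidating the other.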
In the simplex states, there are a lot of “pending pays.” These pending pays are conditional payments that are not resolved yet.
For each conditional payment, the key components are the Conditions and the Transfer Function. One can think of the conditions as function pointers, together with the input parameters for those functions, that retrieve outcomes from the applications the payment depends on. Once the outcomes are available, they are passed (as returned bytes) to the Transfer Function, which interprets them into the result of the conditional payment (i.e., how much of the payment is actually transferred).
There are some important “built-in” Transfer Functions. Taking the BooleanAnd function as an example, this function assumes that each condition’s outcome is a boolean value and resolves the conditional payment to pay the full amount if and only if all returned outcomes are True. More details on the built-in Transfer Functions can be found in the CelerCore doc.
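As a minimal sketch of how Conditions and a BooleanAnd-style Transfer Function fit together, consider the Python model below. The `Condition` and `ConditionalPay` classes and the one-byte outcome encoding are assumptions for illustration; the actual message formats are defined in the CelerCore doc.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Condition:
    """A 'function pointer' plus its input parameter (hypothetical structure)."""
    get_outcome: Callable[[bytes], bytes]  # query -> raw outcome bytes
    query: bytes


def boolean_and(outcomes: List[bytes], max_amount: int) -> int:
    """Pay the full amount only if every outcome decodes to True."""
    # Assumed encoding: the single byte 0x01 means True, anything else is False.
    all_true = all(o == b"\x01" for o in outcomes)
    return max_amount if all_true else 0


@dataclass
class ConditionalPay:
    max_amount: int
    conditions: List[Condition]
    transfer_func: Callable[[List[bytes], int], int]

    def resolve(self) -> int:
        # Query every condition, then let the transfer function interpret the bytes.
        outcomes = [c.get_outcome(c.query) for c in self.conditions]
        return self.transfer_func(outcomes, self.max_amount)


alice_won = Condition(lambda q: b"\x01", b"chess-game-42")
oracle_no = Condition(lambda q: b"\x00", b"temperature>72F")

paid = ConditionalPay(10, [alice_won], boolean_and).resolve()
cancelled = ConditionalPay(10, [alice_won, oracle_no], boolean_and).resolve()
print(paid, cancelled)
```

Keeping the transfer function separate from the conditions is what lets the same payment structure carry very different application logic.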
With these core data structures, we have designed a highly efficient protocol suite to set up, resolve, settle and concurrently send conditional payments. Each of the processes is highly optimized for the lowest amount of messaging overhead. These protocols and messages are described in detail in the CelerCore doc.
Similar to all communication protocols, if the sender always waits for the response of the previous request before issuing the next request, then the total throughput would be significantly limited by the round-trip processing time. A natural step forward is to improve the off-chain performance with the sliding window protocol.
One particular challenge of using the sliding window is that the message for the simplex state updates (including CondPayRequest and PaymentSettleRequest) is fundamentally different from the packet of the common data transmission protocols such as TCP. Unlike the TCP packets, simplex state update messages are strongly correlated: the simplex state in one message is always based on the state in the previous message. Therefore, one invalid message will invalidate all the subsequent messages based on it. CelerPay has modified the traditional sliding window protocol to tackle this challenge so that the state update requests can be sent and processed at a much higher speed.
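The Python sketch below shows one way such a modified sliding window could behave on the sender side. It is not CelerPay's actual algorithm: the `SimplexSender` class, its fields, and the rollback rule are assumptions chosen to illustrate how a rejected state voids every later state built on it.

```python
from collections import deque
from typing import Deque, List, Tuple


class SimplexSender:
    """Sender side of a sliding window over chained state updates (illustrative)."""

    def __init__(self, window: int) -> None:
        self.window = window
        self.acked_seq = 0    # last sequence number the peer accepted
        self.acked_total = 0  # transferred total in that accepted state
        self.inflight: Deque[Tuple[int, int]] = deque()  # (seq, total), oldest first

    def send(self, amount: int) -> int:
        """Build the next state on the newest one without waiting for ACKs."""
        if len(self.inflight) >= self.window:
            raise RuntimeError("window full")
        if self.inflight:
            base_seq, base_total = self.inflight[-1]
        else:
            base_seq, base_total = self.acked_seq, self.acked_total
        state = (base_seq + 1, base_total + amount)
        self.inflight.append(state)
        return state[0]

    def on_ack(self, seq: int) -> None:
        """Accepting `seq` implies that every earlier state was accepted too."""
        while self.inflight and self.inflight[0][0] <= seq:
            self.acked_seq, self.acked_total = self.inflight.popleft()

    def on_nack(self, seq: int) -> List[int]:
        """Rejecting `seq` voids it and every later state that was based on it."""
        dropped = [s for s, _ in self.inflight if s >= seq]
        self.inflight = deque((s, t) for s, t in self.inflight if s < seq)
        return dropped


sender = SimplexSender(window=4)
for amount in (5, 7, 3):
    sender.send(amount)      # three updates sent back to back
sender.on_ack(1)
dropped = sender.on_nack(2)  # the second update is invalid, so the third goes too
next_seq = sender.send(3)    # rebuilt on the last accepted state
print(dropped, next_seq, sender.inflight[-1])
```

In this toy version the sender reuses the rejected sequence number after the rollback; a production protocol may make a different choice.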
With both the core data structures and the messaging protocol in place, we arrive at a protocol that is simple, full-duplex, and high-throughput.
Multi-hop Protocol: End-to-end Design Principle
With the powerful composability of single-hop conditional payment, we can build multiple different multi-hop payment protocols. When designing multi-hop protocols, we follow the end-to-end design principle.
The end-to-end design principle has guided the evolution of the Internet. Its original premise is that a functionality should not be implemented in the network if it has to be checked and confirmed by an end node (client) anyway. A classic example is data transfer reliability. In the early days of Internet design, there were debates about whether routers in the network should try to reduce packet drops and help provide a reliable delivery abstraction to the end points. However, supporting something as seemingly simple as reliable delivery in-network raises a galaxy of issues, ranging from physical delivery and fair queuing in the network to congestion suppression and more. In the end, there is no practical way for end points to avoid checking end-to-end data delivery reliability themselves. Therefore, the early Internet community reached a consensus to remove reliability requirements from the network and to implement the Transmission Control Protocol (TCP) to ensure end-to-end data delivery. With this decision, the community arrived at the highly scalable, flexible, and composable TCP/IP architecture known as the Internet today.
To build a high-performance, reliable and low-cost state channel network, we also employ this design principle with the following two manifestations:
- Decouple the multi-hop funding allocation and the application states progression.
- Push the complexity of conditional payment setup, resolution, and settlement to the edge of the multi-hop network.
Composing from the single-hop conditional payment primitive, we currently support the two most commonly used conditional payment protocols out of the box: boolean conditional payment and numeric conditional payment.
Boolean Conditional Payment allows clients to send each other conditional payments that depend on a boolean outcome. The conditional payment resolves to fully paid if and only if the condition outcome proves to be True, for example, an Oracle contract’s result asserting that tomorrow’s temperature will be higher than 72F, or a state channel outcome claiming that Alice is the winner of a state channel Chess game. Otherwise, the payment is fully cancelled.
It is important to note that boolean conditional payment follows our design principle and has the following properties that are not available in any other state channel protocol fund allocation process:
- Simplicity: Relay nodes never care about application or condition logic.
- Low-cost: Relay nodes never initiate an on-chain dispute for any payment.
- Secure: Relay nodes are resilient to arbitrary malicious application or condition logic.
- Robust: Relay nodes never need to monitor on-chain conditions or payment states actively.
- Low messaging overhead: Relay nodes never modify any conditional payment messages. The number of message exchanges is optimized for both cooperative and dispute cases.
Numeric Conditional Payment allows a more flexible condition resolution mechanism. It expects the outcome of the condition it depends on to be a numerical value between 0 and the maximal value. With numeric conditional payment, one can build highly flexible application logic such as a “virtual channel” application, where two end-to-end parties can directly send payments back and forth without involving middle nodes.
Numeric conditional payment also maintains the same simplicity, low cost, robustness, and low messaging overhead as the Boolean Conditional Payment protocol.
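To make the “virtual channel” idea concrete, here is a toy Python sketch. The `VirtualChannelApp` class and `resolve_numeric` helper are hypothetical and do not reflect the CelerCore message flow; they only show how many direct end-to-end payments can collapse into a single numeric outcome that resolves one multi-hop conditional payment.

```python
class VirtualChannelApp:
    """Toy app: two end parties pay back and forth; only the net result matters."""

    def __init__(self) -> None:
        self.net_to_bob = 0
        self.finalized = False

    def pay(self, sender: str, amount: int) -> None:
        # Direct end-to-end update: no relay node sees or signs this.
        if self.finalized:
            raise RuntimeError("channel already finalized")
        self.net_to_bob += amount if sender == "alice" else -amount

    def finalize(self) -> None:
        self.finalized = True

    def outcome(self) -> int:
        # Numeric outcome: how much Alice owes Bob in total, never negative.
        return max(self.net_to_bob, 0)


def resolve_numeric(outcome: int, max_amount: int) -> int:
    """Clamp the numeric outcome into the range [0, max_amount]."""
    return max(0, min(outcome, max_amount))


app = VirtualChannelApp()
app.pay("alice", 6)
app.pay("bob", 2)
app.pay("alice", 3)
app.finalize()

within_limit = resolve_numeric(app.outcome(), max_amount=10)
capped = resolve_numeric(app.outcome(), max_amount=5)
print(within_limit, capped)
```

Relay nodes only ever handle the one conditional payment whose maximal value bounds the virtual channel; the individual payments stay between the two end parties.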
Celer Apps: Simple and Flexible
With this decoupled architecture, any blockchain smart contract, state channel or not, can become a CelerApp contract by exposing two functions for CelerPay to use as a payment condition: isFinalized returns whether the app state outcome is finalized; getOutcome returns the boolean or numeric outcome of the app, covering the optimized common cases discussed in the CelerPay Off-Chain Protocol.
The Solidity interfaces for CelerApp contracts are shown above. Each function takes a single argument in generic bytes format so that arbitrary query logic can be supported. These two state query functions are the only requirements from CelerPay, and they are independent of how the app states are progressed and resolved. Therefore, a CelerApp can be a turn-based state channel application, an on-chain application like a Chainlink Oracle, or even a roll-up chain application state.
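The shape of this two-function interface can be sketched in Python rather than Solidity. The method names mirror isFinalized and getOutcome from the text; the `CelerApp` base class, the `TemperatureOracle` example, and its query format are hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, Union


class CelerApp(ABC):
    """The two queries a payment condition needs, each taking opaque bytes."""

    @abstractmethod
    def is_finalized(self, query: bytes) -> bool:
        """Return whether the outcome addressed by `query` is final."""

    @abstractmethod
    def get_outcome(self, query: bytes) -> Union[bool, int]:
        """Return the boolean or numeric outcome addressed by `query`."""


class TemperatureOracle(CelerApp):
    """Hypothetical non-channel app: an oracle can satisfy the same interface."""

    def __init__(self) -> None:
        self._readings: Dict[bytes, int] = {}

    def report(self, day: bytes, temp_f: int) -> None:
        self._readings[day] = temp_f

    def is_finalized(self, query: bytes) -> bool:
        return query in self._readings

    def get_outcome(self, query: bytes) -> bool:
        if not self.is_finalized(query):
            raise ValueError("outcome not finalized")
        return self._readings[query] > 72


oracle = TemperatureOracle()
before = oracle.is_finalized(b"2020-01-01")
oracle.report(b"2020-01-01", 75)
after = oracle.is_finalized(b"2020-01-01")
hot = oracle.get_outcome(b"2020-01-01")
print(before, after, hot)
```

Nothing in the interface says how the app reaches its outcome, which is why a turn-based game and an oracle can both serve as payment conditions.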
Note on security
A common question about this decoupled architecture concerns the security assumptions of the applications on which conditional payments depend. If the interface is so minimal, how can the allocated funds be secure? What happens if the application logic exhibits Byzantine behavior, e.g., isFinalized returns true and then later flips to false, or getOutcome returns non-deterministic results? These are indeed challenging questions in the system design.
We would like to highlight that, through careful design of the off-chain messaging protocols and data structure primitives, the CelerPay generic conditional payment network does not rely on the reliability or security of the CelerApp interfaces and is therefore resilient to any Byzantine behavior of the applications.
A Turn-based State Progression Protocol
With this flexible interface, we also provide a framework specifically for state channel applications. More specifically, we provide two frameworks with turn-based state progression for developers to build interactive and real-time state channel applications:
- Single-Session App is mostly used as a one-time virtual contract for fixed players without initial deployment. The player who wants to bring the off-chain game to on-chain dispute needs to first deploy the contract through the VirtContractResolver.
- Multi-Session App supports multiple groups of players that are playing games on the same contract. It is usually initially deployed once by the developer and then repeatedly shared by all players. No additional code needs to be deployed when a player wants to dispute on-chain. Multi-session app is more suitable for popular games because the (expensive) contract deployment is a one-time process, while the dispute cost for each player is much lower.
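The turn-based progression shared by both templates, and the idea of many player groups sharing one deployed app, can be sketched as follows. All names here are hypothetical, signature checks are omitted, and the rule that the highest sequence number wins a dispute is an assumption for the example rather than a statement of the actual adjudication logic.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AppState:
    seq_num: int            # increases by one with every accepted move
    turn: int               # index of the player who moves next
    moves: Tuple[int, ...]  # app-specific state, here just the move history


class TurnBasedSession:
    def __init__(self, players: List[str]) -> None:
        self.players = list(players)
        self.state = AppState(seq_num=0, turn=0, moves=())

    def apply_move(self, player: str, move: int) -> None:
        """Cooperative path: only the player whose turn it is may advance the state."""
        if self.players[self.state.turn] != player:
            raise ValueError("not this player's turn")
        self.state = AppState(
            seq_num=self.state.seq_num + 1,
            turn=(self.state.turn + 1) % len(self.players),
            moves=self.state.moves + (move,),
        )

    def submit_state(self, claimed: AppState) -> bool:
        """Dispute path (assumed rule): a state with a higher seq_num replaces ours."""
        if claimed.seq_num > self.state.seq_num:
            self.state = claimed
            return True
        return False


class MultiSessionApp:
    """One shared app object that serves many independent player groups."""

    def __init__(self) -> None:
        self.sessions: Dict[str, TurnBasedSession] = {}

    def session(self, session_id: str, players: List[str]) -> TurnBasedSession:
        if session_id not in self.sessions:
            self.sessions[session_id] = TurnBasedSession(players)
        return self.sessions[session_id]


app = MultiSessionApp()
game1 = app.session("alice-bob", ["alice", "bob"])
game1.apply_move("alice", 4)
game1.apply_move("bob", 0)
game2 = app.session("carol-dave", ["carol", "dave"])
stale_accepted = game1.submit_state(AppState(seq_num=1, turn=1, moves=(4,)))
print(game1.state.seq_num, game2.state.seq_num, stale_accepted)
```

In the single-session case there would be exactly one such session per contract, deployed only if a dispute makes it necessary.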
Using the contract templates together with the CelerNode SDK, which takes care of all the off-chain state channel logic, CelerApp developers can focus on the app-specific logic without being burdened by state-channel-related concerns such as signature verification, sequence number tracking, or state machine management. To learn more about the application templates, please check out the CelerApp section of the CelerCore doc.
This blog provides a high-level summary of Celer Network’s open-source protocol specification, which aims to build a high-performance, secure, low-cost, and flexible state channel network. To learn more technical details, please check our CelerCore documentation.
We welcome any questions in our developer Discord channel. If you are interested in speeding up your application on Ethereum, please reach out to us at email@example.com!