Optimize Godwoken finality and on-chain cost

jjy · September 8, 2022, 2:20pm

Abstract

Godwoken uses block numbers to determine the finality of layer-2 blocks. This finality mechanism shows several weaknesses once challenges are taken into account. In this article, we discuss possible approaches to correcting the existing finality mechanism, as well as some sub-optimal solutions that could be applied to Godwoken to reduce operating costs on the layer-1 chain.

Motivation

In the current implementation, Godwoken determines the finality of a layer-2 block by comparing the block number with a field in GlobalState - last_finalized_block_number - whose value always points to the last finalized block number. If block.number <= last_finalized_block_number, the block is considered finalized.

The global state represents the rollup state stored in a layer-1 cell. When Godwoken submits a new layer-2 block, the last_finalized_block_number gets updated accordingly. The consensus constant (in RollupConfig) finality_blocks defines how the last finalized block number is calculated, i.e. last_finalized_block_number = new_block.number - finality_blocks.
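
The block-number rule described above can be sketched as follows. This is a minimal illustration, not Godwoken's actual code: the simplified GlobalState struct and both function names are hypothetical, and the use of saturating subtraction for early blocks is an assumption.

```rust
/// Hypothetical, simplified view of the rollup's global state.
pub struct GlobalState {
    pub last_finalized_block_number: u64,
}

/// last_finalized_block_number = new_block.number - finality_blocks.
/// Assumption: clamp to 0 while the chain is shorter than finality_blocks.
pub fn next_last_finalized_block_number(new_block_number: u64, finality_blocks: u64) -> u64 {
    new_block_number.saturating_sub(finality_blocks)
}

/// A block is finalized when block.number <= last_finalized_block_number.
pub fn is_finalized_by_number(block_number: u64, state: &GlobalState) -> bool {
    block_number <= state.last_finalized_block_number
}
```

Because the rule counts blocks rather than elapsed time, halving the block interval also halves the real-world challenge window.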

However, this design has two downsides:

The layer-2 block time isn’t constant. Layer-2 block time may be affected by layer-1’s block space capacity. In some scenarios, users may have to wait longer than expected before their withdrawals are finalized.
Impact of block speed on user experience. For a better user experience, we need to accelerate the block speed. But if we continue to rely on block numbers for finality, blocks will be finalized sooner than intended, and validators are unlikely to have enough time to initiate a challenge.

Thus, we need to correct the finality mechanism. The security assumption of an optimistic rollup is that at least one validator sends a challenge to revert an invalid state within the challenge period. Under this assumption, a timestamp is a natural choice for determining layer-2 block finality.

Overview

Our basic idea is to impose stricter constraints on layer-2 block timestamps so that they are reliable and usable for determining finality.

Getting a timestamp on CKB is not that easy. More details can be found here: Off-chain determinism. In short, CKB's design philosophy requires the result of a transaction to be determined off-chain, and CKB relies on this property to avoid verifying the same transaction repeatedly. Because of network conditions, we cannot know precisely which block a transaction will be packed into. If we exposed the block timestamp to CKB scripts and used it to decide validity, transaction verification would no longer be deterministic: the result could differ depending on the timestamp.

With CKB, we have two methods to get the timestamp:

Use the since field. The CKB consensus verifies that since <= block median time, but the result may be inaccurate because a malicious user can use an earlier timestamp (0017-tx-valid-since.md).
Load the CKB block header from an input cell to get the last updated timestamp for that cell. If the cell is left untouched for a long time, the timestamp will lag behind the on-chain time. (0009-vm-syscalls.md)

To get a relatively accurate timestamp, we use the second method: we load the block header through the rollup's input cell. We also allow the submitter to provide a block header; we then compare the two timestamps and choose the greater one.

We define two variables:

rollup_cell_timestamp - the timestamp from the rollup cell’s header
user_submitted_header_timestamp - the timestamp from the user’s submitted header

We calculate the on_chain_time and use it to constrain the layer-2 block timestamp:


let on_chain_time = max(rollup_cell_timestamp, user_submitted_header_timestamp);

block.timestamp > parent_block.timestamp

block.timestamp < on_chain_time + 2h

block.timestamp > on_chain_time - 2h
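
A minimal sketch of the three constraints above. The function names are hypothetical, the submitted header is modelled as optional, and all timestamps are assumed to share one unit (seconds here); CKB header timestamps are in milliseconds, so a real script would need to convert.

```rust
/// The 2h tolerance window from the constraints, in seconds (unit is an assumption).
const MAX_DRIFT_SECS: u64 = 2 * 60 * 60;

/// on_chain_time = max(rollup_cell_timestamp, user_submitted_header_timestamp).
/// Assumption: the submitter's header is optional; without it we fall back
/// to the rollup cell's header timestamp.
pub fn on_chain_time(rollup_cell_timestamp: u64, user_submitted_header_timestamp: Option<u64>) -> u64 {
    match user_submitted_header_timestamp {
        Some(t) => rollup_cell_timestamp.max(t),
        None => rollup_cell_timestamp,
    }
}

/// Checks that the block is newer than its parent and lies strictly
/// inside the (on_chain_time - 2h, on_chain_time + 2h) window.
pub fn is_block_timestamp_valid(block_ts: u64, parent_ts: u64, on_chain_time: u64) -> bool {
    block_ts > parent_ts
        && block_ts < on_chain_time.saturating_add(MAX_DRIFT_SECS)
        && block_ts > on_chain_time.saturating_sub(MAX_DRIFT_SECS)
}
```

Saturating arithmetic is used only so the sketch cannot overflow or underflow on extreme inputs.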

As such, the layer-2 block’s timestamp becomes verifiable. To determine the finality of a block:


const FINALIZE_TIME_IN_SECS: u64 = 604800; // 7 days

fn is_finalized(b: &L2Block, on_chain_time: u64) -> bool {
    b.timestamp + FINALIZE_TIME_IN_SECS <= on_chain_time
}

In the CKB script, we can obtain on_chain_time in the same way as described above for validating the block timestamp. In a non-script (off-chain) environment, we can call the CKB RPC get_block_median_time (ckb/rpc at develop · nervosnetwork/ckb · GitHub) to get a timestamp and use it as on_chain_time.

The finality_blocks consensus constant should be renamed to finality_time_in_secs, and the last_finalized_block_number should be removed from the GlobalState.

We should also support a new on-chain operation. That is, if the Rollup cell has not been updated in 4h, then anyone can touch it. Thus, with the security assumption, the on-chain time we fetch in the scripts is always close to the real on-chain time.
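
One way the staleness check for this touch operation might look. The post does not specify how a script proves that 4h have passed; this sketch assumes the caller supplies a newer header timestamp (for example, from a header the transaction references), and the function and constant names are hypothetical. Timestamps are assumed to be in seconds.

```rust
/// The 4h staleness threshold, in seconds (unit is an assumption).
const TOUCH_INTERVAL_SECS: u64 = 4 * 60 * 60;

/// Returns true when the supplied header timestamp shows that the rollup
/// cell has not been updated for at least 4h, so anyone may touch it.
pub fn can_touch_rollup_cell(rollup_cell_timestamp: u64, newer_header_timestamp: u64) -> bool {
    // saturating_sub yields 0 if the supplied header is older than the cell,
    // which correctly rejects the touch.
    newer_header_timestamp.saturating_sub(rollup_cell_timestamp) >= TOUCH_INTERVAL_SECS
}
```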

Sub Optimals

Deprecate withdrawal cells

Withdrawal from Godwoken requires two phases:

Generate a withdrawal cell in layer-1
After the block is finalized, users can unlock the withdrawal cell to transfer the money to their addresses

In the initial design of Godwoken, users could sell their assets at a discount when withdrawing: they could get their money faster if someone was willing to buy the withdrawal cell. But bridges can also implement this kind of feature, so we removed it to reduce the complexity of the scripts.

Instead of generating withdrawal cells, we can generate cells for users’ addresses directly.

We add a new field, last_finalized_withdrawal in the global state:


struct LastFinalizedWithdrawal {
    block_number: Uint64,
    withdrawal_index: Uint32,
}

And we add a new on-chain operation, i.e., finalize withdrawals. This operation requires layer-2 blocks with Merkle proofs and checks the following constraints:

layer-2 blocks are finalized;
layer-2 blocks and withdrawals must be continuous. All withdrawals must be processed.

Once the constraints are satisfied, we verify that the withdrawn assets are removed from layer-2 and that the corresponding cells are generated for the users’ addresses.
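
The finality and continuity checks for this operation could be sketched as below. This is a simplified model under stated assumptions: whole blocks are processed at once, withdrawal_index stores the number of withdrawals processed in the last block (the post does not define its encoding), and Merkle proof verification and output-cell checks are omitted. FinalizedBlock and check_finalize_withdrawals are hypothetical names.

```rust
const FINALIZE_TIME_IN_SECS: u64 = 604_800; // 7 days

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LastFinalizedWithdrawal {
    pub block_number: u64,
    pub withdrawal_index: u32,
}

/// Hypothetical summary of a layer-2 block submitted to the operation.
pub struct FinalizedBlock {
    pub number: u64,
    pub timestamp: u64,
    pub withdrawals_len: u32,
}

/// Walks the submitted blocks and returns the new cursor, or an error if a
/// block is skipped or not yet finalized.
pub fn check_finalize_withdrawals(
    last: LastFinalizedWithdrawal,
    blocks: &[FinalizedBlock],
    on_chain_time: u64,
) -> Result<LastFinalizedWithdrawal, &'static str> {
    let mut cursor = last;
    for b in blocks {
        // Continuity: each block must directly follow the previous one.
        if Some(b.number) != cursor.block_number.checked_add(1) {
            return Err("blocks are not continuous");
        }
        // Finality: same rule as is_finalized in the post.
        if b.timestamp.saturating_add(FINALIZE_TIME_IN_SECS) > on_chain_time {
            return Err("block is not finalized");
        }
        cursor = LastFinalizedWithdrawal {
            block_number: b.number,
            withdrawal_index: b.withdrawals_len,
        };
    }
    Ok(cursor)
}
```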

Reduce on-chain verification cost

Godwoken verifies the account balance for user deposits and withdrawals when submitting a new layer-2 block. Verifying the Merkle proofs of the state on-chain for every submission costs layer-1 block space and cycles.

In this regard, we can move the on-chain verification of deposits to the challenge phase to reduce the use of layer-1 resources. In the challenge phase, we execute the whole layer-2 transaction on the layer-1 chain. By comparing the new state root with the corresponding checkpoint in the layer-2 block, we can determine the validity of a layer-2 transaction.

But in practice, on-chain execution of a layer-2 transaction may cost too many resources. To solve this problem, we must implement the interactive challenge: binary-search for the single step that causes the dispute and execute only that one instruction on-chain to resolve the conflict. This makes the checkpoints field of the layer-2 block unnecessary, and we can remove it to save 32 * txs_len bytes.
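
The bisection idea behind an interactive challenge can be illustrated as follows. The post does not specify the protocol, so this is only a generic sketch: each party's execution trace is modelled as a slice of state commitments (plain u64 values here, as a stand-in for hashes), and find_disputed_step is a hypothetical name.

```rust
/// Finds a step `i` such that both parties agree on the state before step i
/// (index i - 1) and disagree on the state after it (index i).
/// Returns None if the traces cannot be bisected: different lengths, fewer
/// than two states, no common start, or no disagreement at the end.
pub fn find_disputed_step(asserter: &[u64], challenger: &[u64]) -> Option<usize> {
    let n = asserter.len();
    if n < 2 || challenger.len() != n {
        return None;
    }
    if asserter[0] != challenger[0] || asserter[n - 1] == challenger[n - 1] {
        return None;
    }
    // Invariant: the parties agree at `lo` and disagree at `hi`.
    let (mut lo, mut hi) = (0usize, n - 1);
    while hi - lo > 1 {
        let mid = lo + (hi - lo) / 2;
        if asserter[mid] == challenger[mid] {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    Some(hi)
}
```

Only the single step ending at the returned index would then need to be executed on-chain, and the number of comparison rounds grows logarithmically with the trace length.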

keroro520 · September 16, 2022, 9:22am

The input rollup cell is only one of the input cells, and another input cell may have a more recent timestamp.
Therefore, I think using “the most recent time of input cells” instead of rollup_cell_timestamp is more accurate.

jjy · September 16, 2022, 9:50am

If we use “the most recent time of input cells”, we need to load a bunch of CKB headers into memory, which may cost too many resources in the scripts and limit the number of inputs of the transaction.

The input rollup cell should be accurate enough if we implement the following mechanism.

We should also support a new on-chain operation. That is, if the Rollup cell has not been updated in 4h, then anyone can touch it. Thus, with the security assumption, the on-chain time we fetch in the scripts is always close to the real on-chain time.

BTW, we can also load headers from the tx’s header_deps field and find the newest timestamp by comparing them with the header loaded from the rollup cell input.

keroro520 · September 16, 2022, 9:58am

Agree.
Enough is enough.

In addition, “the most recent time of input cells” may make it more difficult for the L2 generator when mining new l2transactions if the L2 generator doesn’t know a certain time.

Flouse · October 13, 2022, 11:02am

Godwoken v1.7-rc has a notable change: it accelerates the block speed for a better user experience.

The average block time of Godwoken testnet_v1 is now around 8s.

Please let us know if your DApp experiences issues with this change. Thank you!

Related PR

Decouple block producing, submission and confirming by sopium · Pull Request #776 · godwokenrises/godwoken · GitHub