Ideas on chained locks

xxuejie · June 23, 2021, 4:07pm

Working with CKB means exploring uncharted territory, so some of the design choices we made turn out to be problematic. Dynamic linking in CKB, for example, has done more harm than good. Hence we decided to deprecate it in favor of exec, which arrives in the next hardfork.

That still leaves one question unsolved: sometimes a single lock script has to verify several different kinds of signatures. With dynamic linking, one can load multiple signature verification algorithms into the current VM. Exec, on the other hand, replaces the caller script with the callee script; the callee simply exits with a return code and never returns to the caller. How should we validate multiple different signatures in an exec-based design?

In this article, we will explore the idea of chained locks.

A chained lock looks like a normal lock script, and you can even use it as one. Each chained lock script contains the validation code for one particular signature algorithm; for example, we might have chained lock scripts for secp256k1, secp256r1, RSA, Schnorr, BLS, etc. The trick lies in the 2 different code paths through which a chained lock script can be executed:

Normal Execution

Recall that the entrypoint for a CKB script is exactly like that of a Linux program:

int main(int argc, char* argv[]) {
  // actual logic...
}

Normally, when CKB executes a lock script, argc will be 0 and argv will be completely empty. In this case, a chained lock script executes exactly like a typical lock script: it runs the sighash_all algorithm to calculate the signing message, extracts a signature from the first witness in the current script group, then performs the actual signature verification.

Exec Execution

When a script is loaded and executed via the exec syscall, the caller script can provide arguments to the argc/argv interface used by the entrypoint main function. A chained lock script enters exec execution mode when argc is not zero.
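The choice between the two modes comes down to a single check on argc. Below is a minimal sketch of that check; the names `lock_mode`, `MODE_NORMAL`, `MODE_EXEC`, and `select_mode` are made up for illustration and are not part of any CKB library.

```c
#include <assert.h>

/* Hypothetical labels for the two code paths described above. */
typedef enum { MODE_NORMAL, MODE_EXEC } lock_mode;

/* Pick the code path from argc alone: CKB starts a lock script with
 * argc == 0, while a caller using exec passes at least one argument. */
static lock_mode select_mode(int argc) {
  assert(argc >= 0);
  return argc == 0 ? MODE_NORMAL : MODE_EXEC;
}
```

A real chained lock would call something like this at the top of main and branch into either the sighash_all path or the argv-driven path.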

In exec execution mode, a chained lock script will expect each individual item in argv to take the following form:

<code hash in hex>:<hash type in hex>:<message 1>:<signature 1>:<message 2>:<signature 2>:...:<message n>:<signature n>
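As a rough sketch of how a script might take such an item apart, the code below splits an item on ':' and hex-decodes a single field. It assumes every field is hex-encoded, and the names `field_t`, `split_item`, `hex_digit`, and `hex_decode` are hypothetical helpers, not existing CKB functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Decode one hex digit; returns -1 for a non-hex character. */
static int hex_digit(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Decode `len` hex characters into `out` (capacity `cap` bytes).
 * Returns the number of bytes written, or -1 on malformed input. */
static int hex_decode(const char *hex, size_t len,
                      unsigned char *out, size_t cap) {
  assert(hex != NULL && out != NULL);
  if (len % 2 != 0 || len / 2 > cap) return -1;
  for (size_t i = 0; i < len / 2; i++) {
    int hi = hex_digit(hex[2 * i]);
    int lo = hex_digit(hex[2 * i + 1]);
    if (hi < 0 || lo < 0) return -1;
    out[i] = (unsigned char)((hi << 4) | lo);
  }
  return (int)(len / 2);
}

/* Hypothetical view of one colon-separated field inside an argv item. */
typedef struct {
  const char *start;
  size_t len;
} field_t;

/* Split `item` on ':' into at most `max_fields` fields without copying.
 * Returns the field count, or -1 if the item has too many fields. */
static int split_item(const char *item, field_t *fields, int max_fields) {
  int n = 0;
  const char *p = item;
  for (;;) {
    const char *colon = strchr(p, ':');
    if (n == max_fields) return -1;
    fields[n].start = p;
    fields[n].len = colon ? (size_t)(colon - p) : strlen(p);
    n++;
    if (colon == NULL) break;
    p = colon + 1;
  }
  return n;
}
```

For an item in the message/signature form, a well-formed split yields an even number of fields: the code hash, the hash type, and then one or more pairs.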

Since argv contains null-terminated strings, each component of an item in argv is encoded in hexadecimal. A chained lock script then proceeds with the following logic:

For each message/signature pair included in argv[0], the chained lock script performs its own signature verification logic. If any message/signature pair fails verification, the lock script exits with an error as well.
If argc is exactly 1, the chained lock script returns with a success return code.
Otherwise, the chained lock script locates a cell using the code hash and hash type included in argv[1]. It then removes argv[0] from argv and invokes the exec syscall with the remaining arguments, using the binary provided by the located cell.
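The three steps above can be sketched as an off-chain simulation. Everything here is an assumption made for illustration: `verify_fn` and `exec_fn` are hypothetical callbacks standing in for the real signature check and for "locate the cell, then call exec", `toy_verify` is a placeholder that only compares strings (it is not cryptography), and `toy_exec` simply re-runs the same logic on the remaining arguments, whereas on CKB each hop would run a different script and a successful exec would never return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: check one hex-encoded message/signature pair.
 * Returns 0 on success. */
typedef int (*verify_fn)(const char *msg, size_t msg_len,
                         const char *sig, size_t sig_len);

/* Hypothetical callback standing in for "locate the cell and exec it". */
typedef int (*exec_fn)(const char *target, size_t target_len,
                       int argc, char *argv[]);

/* Return the field starting at *p and advance *p past the next ':'.
 * Sets *p to NULL once the last field has been returned. */
static const char *next_field(const char **p, size_t *len) {
  const char *start = *p;
  const char *colon = strchr(start, ':');
  *len = colon ? (size_t)(colon - start) : strlen(start);
  *p = colon ? colon + 1 : NULL;
  return start;
}

int chained_lock_exec_mode(int argc, char *argv[],
                           verify_fn verify, exec_fn do_exec) {
  assert(argc >= 1 && argv != NULL);
  const char *p = argv[0];
  size_t len;

  /* Skip this script's own <code hash> and <hash type> fields. */
  next_field(&p, &len);
  if (p == NULL) return -1; /* hash type is missing */
  next_field(&p, &len);

  /* Step 1: verify every message/signature pair in argv[0]. */
  while (p != NULL) {
    size_t msg_len, sig_len;
    const char *msg = next_field(&p, &msg_len);
    if (p == NULL) return -2; /* message without a signature */
    const char *sig = next_field(&p, &sig_len);
    if (verify(msg, msg_len, sig, sig_len) != 0) return -3;
  }

  /* Step 2: nothing left to chain to. */
  if (argc == 1) return 0;

  /* Step 3: drop argv[0] and hand the rest to the next chained lock,
   * identified by the first two fields of argv[1]. */
  const char *q = argv[1];
  size_t hash_len, type_len;
  const char *target = next_field(&q, &hash_len);
  if (q == NULL) return -1;
  next_field(&q, &type_len);
  return do_exec(target, hash_len + 1 + type_len, argc - 1, argv + 1);
}

/* Placeholder verifier: accepts only when signature equals message. */
static int toy_verify(const char *msg, size_t msg_len,
                      const char *sig, size_t sig_len) {
  return (msg_len == sig_len && memcmp(msg, sig, msg_len) == 0) ? 0 : 1;
}

/* Placeholder exec: counts hops and re-runs the same logic. */
static int exec_calls = 0;
static int toy_exec(const char *target, size_t target_len,
                    int argc, char *argv[]) {
  (void)target;
  (void)target_len;
  exec_calls++;
  return chained_lock_exec_mode(argc, argv, toy_verify, toy_exec);
}
```

Calling `chained_lock_exec_mode(2, items, toy_verify, toy_exec)` with two well-formed items walks both of them and performs exactly one simulated hop. The negative return codes are arbitrary choices for this sketch.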

Usage Behavior

In this new design, one chained lock script is deployed to Nervos CKB for each individual signature verification algorithm, e.g., secp256k1, Schnorr, RSA, etc. A dapp first builds an entrypoint lock script, which is in charge of general validation of the transaction. The entrypoint lock then prepares a series of items in the above format, one for each chained lock to be invoked. Below is such an example:

┌─────────────────┐
│                 │
│ Entrypoint Lock │
│                 │
└───────┬─────────┘
        │
        │
        │
┌───────▼─────────┐
│                 │
│      RSA        │
│                 │
└───────┬─────────┘
        │
        │
        │
┌───────▼─────────┐
│                 │
│    secp256r1    │
│                 │
└─────────────────┘

As shown in the example, the entrypoint lock first calls into the RSA chained lock using exec, which then calls into the secp256r1 chained lock for more verifications. The arguments prepared by the entrypoint lock might look like the following:

<RSA code hash>:<RSA hash type>:<message 1>:<signature 1>
<secp256r1 code hash>:<secp256r1 hash type>:<message 2>:<signature 2>:<message 3>:<signature 3>

Here message/signature pair 1 will be validated via RSA logic, while message/signature pairs 2 & 3 are validated as secp256r1 signatures.
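On the entrypoint lock side, preparing one such argument amounts to hex-encoding each field and joining the results with ':'. The sketch below shows one way to do that; `bytes_t`, `append_hex`, and `build_item` are hypothetical names, and the sample bytes in the usage note are placeholders rather than real hashes or signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte-slice type for one raw (not yet encoded) field. */
typedef struct {
  const unsigned char *data;
  size_t len;
} bytes_t;

/* Append `len` bytes as lowercase hex at out[pos] and NUL-terminate.
 * Returns the new write position, or -1 if `out` is too small. */
static int append_hex(char *out, size_t cap, int pos,
                      const unsigned char *data, size_t len) {
  static const char digits[] = "0123456789abcdef";
  assert(out != NULL && pos >= 0);
  if ((size_t)pos + 2 * len + 1 > cap) return -1;
  for (size_t i = 0; i < len; i++) {
    out[pos++] = digits[data[i] >> 4];
    out[pos++] = digits[data[i] & 0x0f];
  }
  out[pos] = '\0';
  return pos;
}

/* Build "<field 0>:<field 1>:...:<field n-1>" with every field in hex.
 * Returns the string length, or -1 if `out` is too small. */
static int build_item(char *out, size_t cap,
                      const bytes_t *fields, size_t count) {
  int pos = 0;
  if (cap == 0) return -1;
  out[0] = '\0';
  for (size_t i = 0; i < count; i++) {
    if (i > 0) {
      /* Need room for the separator and the trailing NUL. */
      if ((size_t)pos + 2 > cap) return -1;
      out[pos++] = ':';
    }
    pos = append_hex(out, cap, pos, fields[i].data, fields[i].len);
    if (pos < 0) return -1;
  }
  return pos;
}
```

Passing four fields in the order code hash, hash type, message, signature produces one item in the format shown above; the entrypoint lock would build one such string per chained lock and pass them all as argv when it calls exec.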

This way, we can decouple locks for dapps from locks for signature verifications. A lock script is also empowered with the ability to support arbitrary, dynamically determined signature verification algorithms.

tannr · June 23, 2021, 8:09pm

Great choice on exec over dynamic loading.

I noticed that on the GitHub page linked above there is discussion around “How do I get back to my original script”, which you cannot do with exec because exec replaces the caller with the callee.

However, would something like this work (pseudo-code):

Initially executing Script

argv = [_, bool, _]

if argv[1] == true
  exec(otherScript, cellDeps, idx_of_this_script_in_cell_deps);
else
  new_args = argv[2];
  do_something(new_args);
  return 0;

Script that is Exec’d by above script:

source = argv[1]
index = argv[2]

new_value = do_something();

exec(script_at_source_and_index, false, new_value)

This way, the flag in script1’s default argv, taken from the script’s data structure, is true, but when script1 is re-exec’d that flag is false.

This is a way to somewhat mimic fork().

In the case of exec on CKB, does something like Source::GroupOutput behave the same as in the original script? I assume not, since everything is replaced. So, calling back into the original script with a second exec would not enable the script to load_cell(Source::GroupInput), since such a call depends on the script hash.

xxuejie · June 24, 2021, 2:43am

What you described here is totally doable. In fact, if we look at the defined structure above:

<code hash in hex>:<hash type in hex>:<message 1>:<signature 1>:<message 2>:<signature 2>:...:<message n>:<signature n>

It’s totally possible to further generalize from this structure:

<code hash in hex>:<hash type in hex>:<arg 1>:<arg 2>:<arg 3>:...:<arg n>

Here we are not limiting ourselves to message/signature pairs, which enables more potential chained locks:

Multi-sign locks, where we would have multiple signatures but only one message.
Data validation locks, where the general transaction structure is validated and enforced, and no signature verification is performed.
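To make the multi-sign case concrete, here is a sketch of how such a lock might read the generalized form as one message followed by any number of signatures. It assumes every signature must verify (a real multi-sign lock could use a threshold instead), and `verify_fn`, `next_field`, `verify_multisig_item`, and `toy_verify` are hypothetical names; `toy_verify` is a string comparison placeholder, not cryptography.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: check one signature against the message.
 * Returns 0 on success. */
typedef int (*verify_fn)(const char *msg, size_t msg_len,
                         const char *sig, size_t sig_len);

/* Return the field starting at *p and advance *p past the next ':'.
 * Sets *p to NULL once the last field has been returned. */
static const char *next_field(const char **p, size_t *len) {
  const char *start = *p;
  const char *colon = strchr(start, ':');
  *len = colon ? (size_t)(colon - start) : strlen(start);
  *p = colon ? colon + 1 : NULL;
  return start;
}

/* Interpret an item as
 *   <code hash>:<hash type>:<message>:<sig 1>:...:<sig n>
 * Returns the number of signatures checked, or a negative value when the
 * item is malformed (-1) or a signature is rejected (-2). */
int verify_multisig_item(const char *item, verify_fn verify) {
  assert(item != NULL && verify != NULL);
  const char *p = item;
  size_t len, msg_len;

  next_field(&p, &len); /* <code hash> */
  if (p == NULL) return -1;
  next_field(&p, &len); /* <hash type> */
  if (p == NULL) return -1;

  /* <arg 1> is the single shared message. */
  const char *msg = next_field(&p, &msg_len);

  /* Every remaining argument is a signature over that message. */
  int checked = 0;
  while (p != NULL) {
    size_t sig_len;
    const char *sig = next_field(&p, &sig_len);
    if (verify(msg, msg_len, sig, sig_len) != 0) return -2;
    checked++;
  }
  return checked;
}

/* Placeholder verifier: accepts only when signature equals message. */
static int toy_verify(const char *msg, size_t msg_len,
                      const char *sig, size_t sig_len) {
  return (msg_len == sig_len && memcmp(msg, sig, msg_len) == 0) ? 0 : 1;
}
```

The caller decides how many checked signatures are enough; a return value of 0 means the item carried a message but no signatures.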

This way we can achieve what you are envisioning above. However, I do want to raise a different question: CKB uses a verification model, meaning the on-chain code only does verification. All scripts running on-chain essentially just validate that the enclosing transaction conforms to a certain structure. Most locks, if not all locks, would first check the transaction for some commonly defined formats, then perform one or more signature verifications. Even though it’s possible to have the following workflow:

Script A is executed, validates logic A1, then uses exec to invoke script B
Script B is executed, validates logic B1, then uses exec to invoke script A
Script A is executed, validates logic A2

Considering the fact that exec consumes a non-negligible number of cycles, there will be an economic reason to alter the above logic to the following:

Script A is executed, validates logic A1 and A2, then uses exec to invoke script B
Script B is executed, validates logic B

The latter solution achieves exactly the same behavior, without the cycles incurred by the second exec call (and by any initialization code script A may need to run again).

So my real question here is: do we have a use case, where we really need to resume control after invoking a different script using exec?

matt_ckb · June 25, 2021, 6:44pm

Is one design more composable than the other?

xxuejie · June 26, 2021, 1:43am

I think they are the same thing in terms of composability.

WilliamsBlock · July 1, 2021, 7:21am

A Chinese translation is available here: [Translation] A new idea for letting dApp locks verify multiple signature algorithms: Chained Locks!