[EN/CN] Script-Sourced Rich Information - 来源于 Script 的富信息


Background

目前,CKB 上的 Script 通常作为 Verifier 使用。Script 只允许满足规则的交易通过,以此控制 Cell 的状态转移。在这种范式下,Script 本身几乎不对外暴露任何逻辑,Script 的更多信息只能在链下或链上弱绑定的方式约定。

随着 CKB 上的应用方越来越多,问题逐渐浮出水面。这包括 UDT 类资产如何声明自己的代币信息,NFT 类资产如何定义自己的数据结构,怎样解决交易组装的逻辑需要用不同语言重复实现等。我们迫切需要 Script 强绑定的信息/逻辑来解决应用间的互操作性问题。

这时,DOB 协议在链下执行 Decoder 的模式为我们揭示了一条简单有效的路径。

Introduction

来源于 Script 的富信息(Script-Sourced Rich Information, SSRI)方案是对 Script 能力的一次扩展。作为最显著的能力,Script 能在链下执行并返回结果,让开发者可以将任意的信息/逻辑嵌入到 Script 中,向链下应用描述自己的行为和管理的数据。

Design

SSRI 方案由两部分构成:

  • SSRI 在链上定义了一套 Script 的行为标准。类似于 Rust 中的 trait ,SSRI 让我们得以将行为抽象为一系列的 method 约束,满足约束的 Script 可以被认为具有特定的行为。调用方可以通过指定的方法与 Script 进行交互。
  • SSRI 在链下定义了一套 Script 执行的标准。在链下执行的 Script 可以通过特定的 syscall 获得更多的信息来源来构建输出。

Methods

Method Path

在 SSRI 的通用行为标准中,调用方通过被称为方法路径(Method path)的 64 位(8 字节)指针, 来调用 Script 具体的一个方法。

我们约定,方法路径为方法签名 CKB hash 后的前 8 字节。方法签名的格式为 [<trait>.]<method> ,如 calc_unlock_time 或 UDT.symbol 。我们强烈建议通用的抽象行为方法都应该带有 trait 前缀以避免冲突。因为方法类型不影响方法路径,所以和 Rust 一样,Script 方法不支持重载。

Distribution

为了实现通过方法路径来控制 Script 的行为,我们注意到 CKB VM 的程序入口定义如下:

fn __ckb_std_main(argc: core::ffi::c_int, argv: *const $crate::env::Arg) -> i8;

和 C 的程序入口一样,argc 和 argv 共同描述了程序的参数。

:bulb: CKB Script 的 Rust 包 ckb-std 提供了 ckb_std::env::argv 方法来访问这些参数。

我们约定,遵守 SSRI 的 Script 应当从 argv[0] 读取方法路径,并且将剩余的参数传递给方法。开发者可以自行决定未定义情况的处理方式,包括未指定方法路径/方法路径长度错误/方法不存在。我们推荐:

  • 若未指定方法路径,Script 可以按原来的 Verifier 逻辑执行。
  • 若方法路径长度错误或方法不存在,Script 应当以任意错误码退出。

Off-Chain Syscalls

链下 Script 依然在 CKB VM 中执行,此处列举了目前链下可用的 Syscall 及他们与链上版本的区别,详细的介绍见 VM Syscalls、VM Syscalls 2 和 VM Syscalls 3。

在链下执行的 Script 可以不依附在任何实体上,这意味着 Script 不一定总能调用全部 syscall。我们将执行环境分为四个等级:

  1. Code 。只有程序代码。
  2. Script 。完整的 Script 结构。
  3. Cell 。完整的 Cell 结构。
  4. Transaction 。完整的 Transaction 结构。

VM Version - Code

目前,如果一个 Script 在链下被执行,VM version syscall 应当返回 -1。

Exit - Code

Set Content - Code

最大输出长度受执行环境决定,不小于 256K。

Debug - Code

Load Script Hash - Script

Load Script - Script

Load Cell - Cell

如果执行等级为 Cell,index 必须为 0,source 必须为 Source::InputGroup。

Load Cell By Field - Cell

如果执行等级为 Cell,index 必须为 0,source 必须为 Source::InputGroup。

Load Cell Data - Cell

如果执行等级为 Cell,index 必须为 0,source 必须为 Source::InputGroup。

Find Out Point By Type - Code

int ckb_find_out_point_by_type(void* addr, size_t* len,
                               const void* type_script, size_t script_len)
{
  return syscall(u64_le_from_bytes(method_path("find_out_point_by_type")), addr, len, type_script, script_len, 0, 0);
}

Script 在链下执行时特有的 Syscall。该 Syscall 会以 type script 在链上搜索一个 Live cell,若存在则将第一个找到的 Cell(注意,这意味着结果可能不总是固定的)对应的 Out point 存入到 addr 中,len 为 36。否则 len 为 0。

Find Cell By Out Point - Code

int ckb_find_cell_by_out_point(void* addr, size_t* len, const void* out_point)
{
  return syscall(u64_le_from_bytes(method_path("find_cell_by_out_point")), addr, len, out_point, 0, 0, 0);
}

Script 在链下执行时特有的 Syscall。该 Syscall 会以 Out point 在链上搜索一个 Live cell,若存在则将其 Cell output 存入到 addr 中。否则 len 为 0。

Find Cell Data By Out Point - Code

int ckb_find_cell_data_by_out_point(void* addr, size_t* len, const void* out_point)
{
  return syscall(u64_le_from_bytes(method_path("find_cell_data_by_out_point")), addr, len, out_point, 0, 0, 0);
}

Script 在链下执行时特有的 Syscall。该 Syscall 会以 Out point 在链上搜索一个 Live cell,若存在则将其 Cell data 存入到 addr 中。否则 len 为 0。

SSRI Trait

所有使用了 SSRI 方案的 Script 都应当实现这些方法,以帮助 Script 被使用。在使用了 SSRI 方案的 Trait 设计中,我们默认使用 Rust 的类型描述参数和返回值:

  • 数字类型均用小端序编码。
  • 布尔类型占用单个字节,若为 0 则为 false,否则为 true。
  • 字符串类型以 utf-8 格式编码,按 Molecule 中的 vector<byte> 类型储存。
  • 复合类型使用 Molecule 编码。

SSRI.version - Code

fn version() -> u8;

目前,该方法总应返回 0。

SSRI.get_methods - Code

fn get_methods(offset: u64, limit: u64) -> Vec<Bytes8>;

用于枚举 Script 所有的方法。输入参数 offset 和 limit 用作分页,limit 若为 0 则不限制,返回对应的方法路径数组。

SSRI.has_methods - Code

fn has_methods(methods: Vec<Bytes8>) -> Vec<bool>;

判断一组方法是否存在。参数为目标方法路径数组。

SSRI.get_cell_deps - Code

fn get_cell_deps(offset: u64, limit: u64) -> Vec<CellDep>;

用于枚举 Script 在链上执行时需要的 cell deps (不包含自己)。输入参数 offset 和 limit 用作分页,limit 若为 0 则不限制,返回对应的 cell deps 数组。

SSRI - UDT

基于 SSRI 方案,我们可以描述 UDT 类型的 Script 应当具有的行为


Here is the English Version:

Background

Currently, scripts on CKB are usually used as verifiers. A script only lets through transactions that satisfy its rules, thereby controlling the state transitions of cells. Under this paradigm, the script itself exposes almost no logic to the outside, and any further information about the script can only be agreed upon off-chain or through loosely bound on-chain conventions.

As more applications emerge on CKB, problems are gradually surfacing. These include how UDT assets declare their token information, how NFT assets define their data structures, and how to avoid re-implementing the same transaction assembly logic in every language. We urgently need information and logic that are strongly bound to scripts to address interoperability between applications.

Here, the DOB protocol's model of executing a decoder off-chain points to a simple and effective path.

Introduction

The Script-Sourced Rich Information (SSRI) solution extends what a script can do. Its most significant feature is that a script can be executed off-chain and return a result. This lets developers embed arbitrary information and logic in the script, so that it can describe its own behavior and the data it manages to off-chain applications.

Design

The SSRI solution consists of two parts:

  • On-chain, SSRI defines a behavior standard for scripts. Similar to traits in Rust, SSRI allows us to abstract behaviors into a series of method constraints. Scripts that meet these constraints can be considered to have specific behaviors. Callers can interact with the script through specified methods.
  • Off-chain, SSRI defines a standard for script execution. A script executed off-chain can draw on additional information sources through specific syscalls when constructing its output.

Methods

Method Path

In the general behavior standard of SSRI, callers invoke a specific method of the script through a 64-bit (8-byte) pointer called a method path.

By convention, the method path is the first 8 bytes of the CKB hash of the method signature. The method signature has the format [<trait>.]<method>, such as calc_unlock_time or UDT.symbol. We strongly recommend that methods describing common abstract behaviors carry a trait prefix to avoid conflicts. Because parameter and return types do not affect the method path, script methods do not support overloading, just as in Rust.
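The derivation above can be sketched as follows. This is a minimal illustration, not a reference implementation: the hash function is passed in by the caller, and `toy_hash` is a hypothetical stand-in used only so the example runs without external crates. In a real script the caller would supply the CKB hash (BLAKE2b-256 with the "ckb-default-hash" personalization), for example from a CKB hashing crate.

```rust
/// Length of a method path in bytes (64 bits).
pub const METHOD_PATH_LEN: usize = 8;

/// Derive a method path from a signature such as "UDT.symbol".
///
/// `ckb_hash` is supplied by the caller (an assumption of this sketch) and
/// is expected to return the 32-byte CKB hash of its input.
pub fn method_path<H>(signature: &str, ckb_hash: H) -> [u8; METHOD_PATH_LEN]
where
    H: Fn(&[u8]) -> [u8; 32],
{
    let digest = ckb_hash(signature.as_bytes());
    // Keep only the first 8 bytes of the digest.
    let mut path = [0u8; METHOD_PATH_LEN];
    path.copy_from_slice(&digest[..METHOD_PATH_LEN]);
    path
}

/// Hypothetical placeholder digest for demonstration only. NOT the CKB hash.
pub fn toy_hash(input: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, byte) in input.iter().enumerate() {
        out[i % 32] ^= *byte;
    }
    out
}
```

Since only the signature string is hashed, two methods with the same name but different parameter types would map to the same path, which is why overloading is not possible.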

Distribution

To control script behavior through method paths, we rely on the fact that the CKB VM program entry is defined as follows:

fn __ckb_std_main(argc: core::ffi::c_int, argv: *const $crate::env::Arg) -> i8;

Like the C program entry, argc and argv together describe the program parameters.

Tip: ckb-std, the Rust package for CKB scripts, provides the ckb_std::env::argv method to access these parameters.

By convention, a script complying with SSRI should read the method path from argv[0] and pass the remaining parameters to the method. Developers can decide how to handle the undefined cases: no method path specified, a method path of the wrong length, or a method that does not exist. We recommend:

  • If no method path is specified, the script can execute the original verifier logic.
  • If the method path length is incorrect or the method does not exist, the script should exit with any error code.
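The recommended handling can be sketched as a small routing function. Everything named here (`Outcome`, `dispatch`, the error constants, the method table) is hypothetical, and the sketch assumes argv[0] carries the raw 8 bytes of the method path; the post does not specify how the path is encoded in argv.

```rust
/// Result of routing an SSRI invocation (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// No method path was supplied: fall back to the verifier logic.
    RunVerifier,
    /// A known method was selected; the remaining arguments are passed on.
    CallMethod { name: &'static str, args: Vec<Vec<u8>> },
}

/// Hypothetical error codes; the recommendation allows any error code.
pub const ERR_BAD_PATH_LEN: i8 = 1;
pub const ERR_UNKNOWN_METHOD: i8 = 2;

/// Route on argv[0]. `methods` maps 8-byte method paths to method names.
pub fn dispatch(
    argv: &[Vec<u8>],
    methods: &[([u8; 8], &'static str)],
) -> Result<Outcome, i8> {
    let (path, rest) = match argv.split_first() {
        Some(parts) => parts,
        // No arguments at all: behave like a plain verifier.
        None => return Ok(Outcome::RunVerifier),
    };
    if path.len() != 8 {
        return Err(ERR_BAD_PATH_LEN);
    }
    let mut key = [0u8; 8];
    key.copy_from_slice(path);
    let name = methods
        .iter()
        .find(|(p, _)| *p == key)
        .map(|(_, n)| *n)
        .ok_or(ERR_UNKNOWN_METHOD)?;
    Ok(Outcome::CallMethod { name, args: rest.to_vec() })
}
```

Keeping the routing separate from the method bodies means the verifier path stays untouched when a script adopts SSRI.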

Off-Chain Syscalls

Scripts executed off-chain still run in the CKB VM. The syscalls currently available off-chain are listed below, together with how they differ from their on-chain versions. Detailed descriptions can be found in VM Syscalls, VM Syscalls 2, and VM Syscalls 3.

A script executed off-chain does not need to be attached to any entity, which means it cannot always call every syscall. We divide the execution environment into four levels:

  1. Code: Only the program code.
  2. Script: The complete script structure.
  3. Cell: The complete cell structure.
  4. Transaction: The complete transaction structure.

VM Version - Code

Currently, if a script is executed off-chain, the VM version syscall should return -1.

Exit - Code

Set Content - Code

The maximum output length is determined by the execution environment and is no less than 256K.

Debug - Code

Load Script Hash - Script

Load Script - Script

Load Cell - Cell

If the execution level is Cell, the index must be 0, and the source must be Source::InputGroup.

Load Cell By Field - Cell

If the execution level is Cell, the index must be 0, and the source must be Source::InputGroup.

Load Cell Data - Cell

If the execution level is Cell, the index must be 0, and the source must be Source::InputGroup.
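One way to read the level annotations in the headings above is as a minimum requirement per syscall. The sketch below assumes the levels are cumulative (a Cell environment also provides everything a Script environment does); the post implies this but does not state it outright. The syscall name strings are illustrative labels, not identifiers from any library.

```rust
/// Execution levels, declared from least to most context so that the
/// derived ordering gives Code < Script < Cell < Transaction.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Code,
    Script,
    Cell,
    Transaction,
}

/// Minimum level for each syscall listed above (labels are illustrative).
pub fn required_level(syscall: &str) -> Option<Level> {
    match syscall {
        "vm_version" | "exit" | "set_content" | "debug" => Some(Level::Code),
        "load_script_hash" | "load_script" => Some(Level::Script),
        "load_cell" | "load_cell_by_field" | "load_cell_data" => Some(Level::Cell),
        _ => None,
    }
}

/// A syscall is usable when the environment provides at least its level.
pub fn is_available(syscall: &str, env: Level) -> bool {
    required_level(syscall).map_or(false, |needed| env >= needed)
}
```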

Find Out Point By Type - Code

int ckb_find_out_point_by_type(void* addr, size_t* len,
                               const void* type_script, size_t script_len)
{
  return syscall(u64_le_from_bytes(method_path("find_out_point_by_type")), addr, len, type_script, script_len, 0, 0);
}

This syscall is available only when a script is executed off-chain. It searches the chain for a live cell with the given type script. If one exists, the out point of the first cell found is stored in addr and len is set to 36; note that this means the result may not always be the same. Otherwise, len is 0.
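On the caller's side, the 36 bytes written to addr can be decoded as sketched below. This assumes the buffer holds a standard CKB OutPoint, that is, a 32-byte transaction hash followed by a 4-byte little-endian output index. The `OutPoint` struct and `parse_out_point` are names made up for this example.

```rust
/// Decoded out point (hypothetical helper type for this sketch).
#[derive(Debug, PartialEq)]
pub struct OutPoint {
    pub tx_hash: [u8; 32],
    pub index: u32,
}

/// Interpret the buffer filled by the syscall. `len` is the length the
/// syscall reported: 36 when a cell was found, 0 when none was found.
pub fn parse_out_point(buf: &[u8], len: usize) -> Option<OutPoint> {
    if len != 36 || buf.len() < 36 {
        return None;
    }
    let mut tx_hash = [0u8; 32];
    tx_hash.copy_from_slice(&buf[..32]);
    let mut index_bytes = [0u8; 4];
    index_bytes.copy_from_slice(&buf[32..36]);
    Some(OutPoint {
        tx_hash,
        index: u32::from_le_bytes(index_bytes),
    })
}
```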

Find Cell By Out Point - Code

int ckb_find_cell_by_out_point(void* addr, size_t* len, const void* out_point)
{
  return syscall(u64_le_from_bytes(method_path("find_cell_by_out_point")), addr, len, out_point, 0, 0, 0);
}

This syscall is specifically designed for off-chain script execution. It searches for a live cell on-chain by out point. If found, it stores the cell output in addr. Otherwise, len is 0.

Find Cell Data By Out Point - Code

int ckb_find_cell_data_by_out_point(void* addr, size_t* len, const void* out_point)
{
  return syscall(u64_le_from_bytes(method_path("find_cell_data_by_out_point")), addr, len, out_point, 0, 0, 0);
}

This syscall is specifically designed for off-chain script execution. It searches for a live cell on-chain by out point. If found, it stores the cell data in addr. Otherwise, len is 0.

SSRI Trait

All scripts using the SSRI solution should implement these methods to facilitate their use. In the SSRI trait design, we default to using Rust’s type descriptions for parameters and return values:

  • Numeric types are encoded in little-endian.
  • Boolean types occupy a single byte, with 0 being false and any other value being true.
  • Strings are encoded in utf-8 and stored as vector<byte> in molecule.
  • Composite types use molecule encoding.
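These rules can be illustrated with a few helpers. The function names are hypothetical. The string layout follows Molecule's vector<byte>, which is a 4-byte little-endian item count followed by the raw bytes.

```rust
/// Numbers are little-endian, shown here for u64.
pub fn encode_u64(value: u64) -> [u8; 8] {
    value.to_le_bytes()
}

/// A boolean is one byte: 0 is false, anything else is true.
pub fn decode_bool(byte: u8) -> bool {
    byte != 0
}

/// Strings are UTF-8 bytes stored as a Molecule vector<byte>:
/// a 4-byte little-endian length followed by the bytes themselves.
pub fn encode_string(text: &str) -> Vec<u8> {
    let bytes = text.as_bytes();
    let mut out = Vec::with_capacity(4 + bytes.len());
    out.extend_from_slice(&(bytes.len() as u32).to_le_bytes());
    out.extend_from_slice(bytes);
    out
}

/// Inverse of `encode_string`; returns None on a malformed buffer.
pub fn decode_string(buf: &[u8]) -> Option<String> {
    if buf.len() < 4 {
        return None;
    }
    let mut header = [0u8; 4];
    header.copy_from_slice(&buf[..4]);
    let declared = u32::from_le_bytes(header) as usize;
    if buf.len() - 4 != declared {
        return None;
    }
    String::from_utf8(buf[4..].to_vec()).ok()
}
```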

SSRI.version - Code

fn version() -> u8;

Currently, this method should always return 0.

SSRI.get_methods - Code

fn get_methods(offset: u64, limit: u64) -> Vec<Bytes8>;

Used to enumerate all methods of the script. The input parameters offset and limit are for pagination. If limit is 0, there is no restriction. It returns the corresponding method path array.
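The offset and limit semantics can be sketched with a generic helper. `paginate` is a hypothetical name, and returning an empty list for an offset past the end is an assumption; the post does not say how that case should be handled.

```rust
/// Return the page starting at `offset`. A `limit` of 0 means "no limit".
pub fn paginate<T: Clone>(items: &[T], offset: u64, limit: u64) -> Vec<T> {
    // Clamp the offset to the list length so slicing cannot go out of bounds.
    let start = offset.min(items.len() as u64) as usize;
    let remaining = items.len() - start;
    let take = if limit == 0 {
        remaining
    } else {
        limit.min(remaining as u64) as usize
    };
    items[start..start + take].to_vec()
}
```

A caller that does not know how many methods exist can either pass limit 0 to fetch everything at once or keep requesting pages until a short page comes back.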

SSRI.has_methods - Code

fn has_methods(methods: Vec<Bytes8>) -> Vec<bool>;

Determines whether a set of methods exists. The parameter is the target method path array.
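A minimal sketch of the lookup, assuming the result is aligned position by position with the queried paths (the post does not state the ordering explicitly). The `implemented` table is a hypothetical list of the method paths a script supports.

```rust
/// For each queried method path, report whether the script implements it.
/// The i-th result corresponds to the i-th queried path.
pub fn has_methods(implemented: &[[u8; 8]], queried: &[[u8; 8]]) -> Vec<bool> {
    queried.iter().map(|path| implemented.contains(path)).collect()
}
```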

SSRI.get_cell_deps - Code

fn get_cell_deps(offset: u64, limit: u64) -> Vec<CellDep>;

Used to enumerate the cell deps required by the script during on-chain execution (excluding itself). The input parameters offset and limit are for pagination. If limit is 0, there is no restriction. It returns the corresponding cell dependencies array.



SSRI - UDT

基于 SSRI 方案,我们可以描述 UDT 类型的 Script 应当具有的行为。被声明为 UDT 类型的 Script 若不具有以下方法,应按照 xUDT 规范中描述的行为进行解析,通过其它方式补全未知项。

UDT Trait

UDT.balance - Cell

fn balance() -> u128;

获取 Cell 中储存的 UDT 余额。

UDT.name - Script

fn name() -> Bytes;

获取 Script 所代表的 UDT 的名字。

UDT.symbol - Script

fn symbol() -> Bytes;

获取 Script 所代表的 UDT 的符号。

UDT.decimals - Script

fn decimals() -> u8;

获取 Script 所代表的 UDT 在十进制下的最小单位在小数点后几位。


English Version:

Introduction

Based on SSRI, we can describe the behavior that a UDT-type Script should have. If a Script declared as a UDT type does not have the following methods, it should be parsed according to the behavior described in the xUDT specification, with the unknown items filled in by other means.

UDT Trait

UDT.balance - Cell


fn balance() -> u128;

Retrieves the UDT balance stored in the Cell.

UDT.name - Script


fn name() -> Bytes;

Retrieves the name of the UDT represented by the Script.

UDT.symbol - Script


fn symbol() -> Bytes;

Retrieves the symbol of the UDT represented by the Script.

UDT.decimals - Script


fn decimals() -> u8;

Retrieves the number of decimal places for the smallest unit of the UDT represented by the Script.
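To show how a wallet might combine UDT.balance and UDT.decimals, here is a small sketch. Both function names are hypothetical. It assumes the u128 balance is returned as 16 little-endian bytes, following the numeric encoding rule of the SSRI trait.

```rust
/// Decode a u128 balance from its 16-byte little-endian form.
pub fn decode_balance(bytes: [u8; 16]) -> u128 {
    u128::from_le_bytes(bytes)
}

/// Render a raw balance with `decimals` digits after the decimal point.
/// Works on the digit string, so no power of ten can overflow.
pub fn format_amount(raw: u128, decimals: u8) -> String {
    let places = decimals as usize;
    if places == 0 {
        return raw.to_string();
    }
    // Left-pad with zeros so there is at least one digit before the point.
    let digits = format!("{:0>width$}", raw, width = places + 1);
    let (int_part, frac_part) = digits.split_at(digits.len() - places);
    format!("{}.{}", int_part, frac_part)
}
```

For example, a raw balance of 1234500000 with 8 decimals is displayed as 12.34500000.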


This tech is a game changer, and I really appreciate the direction this is going in. I can't wait to see the audit of Pausable UDT completed and the revolution that it'll bring, especially when coupled with Fiber!

Phroi
