A Fine-Grained Control Scheme Based on Cobuild-OTX

The current Cobuild-OTX can only decide which input_cells, output_cells, cell_deps, and header_deps are covered by the OTX hash. It cannot control coverage at a finer level, such as excluding output_data from an output. This limitation constrains the flexibility of applications designed with OTX. Previously, Xuejie designed a more complex OTX RFC: Composable Open Transaction Lock Script, which could control the signature coverage range more finely, but it had security issues in some cases, so the plan was changed to the current solution.

However, based on the current Cobuild-OTX scheme, is it possible to make some minor modifications to support fine-grained control?

First, we need to analyze which parts can be controlled more finely. Cell_deps and header_deps do not need this because they only have a fixed single attribute.

For inputs and outputs, however, fine-grained control is possible.

For inputs, there are two attributes to adjust: since and outpoint. This can be expressed with 2 bits per input. Given that input_cells is the number of inputs, only input_cells * 2 bits are needed to control which parts of each input the signature covers. For example, the first bit represents since: when it is 0, the signature covers since, and when it is 1, it does not. Similarly, the second bit represents the outpoint. When the flag is set to 11, the signature effectively applies no constraints to the input. However, since the final signature verification requires unlocking the lock, it implicitly imposes a constraint on the lock.
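The 2-bit scheme for inputs can be sketched as follows. This is only an illustration: the packing order (first input in the highest two bits of the first byte) and the helper names `pack_input_flags`, `input_flag` and `covered_input_fields` are assumptions, not part of any Cobuild specification.

```python
# Hypothetical layout: 2 bits per input, 4 inputs per byte,
# first input stored in the highest two bits of the first byte.
SINCE_BIT = 0b10     # first bit of the pair: 1 means since is NOT covered
OUTPOINT_BIT = 0b01  # second bit of the pair: 1 means outpoint is NOT covered


def pack_input_flags(flags):
    """Pack a list of 2-bit flags (ints 0..3) into a bytes value."""
    out = bytearray((len(flags) * 2 + 7) // 8)
    for i, flag in enumerate(flags):
        if not 0 <= flag <= 0b11:
            raise ValueError("each input flag must fit in 2 bits")
        shift = 6 - 2 * (i % 4)
        out[i // 4] |= flag << shift
    return bytes(out)


def input_flag(data, index):
    """Read the 2-bit flag of the input at `index`."""
    shift = 6 - 2 * (index % 4)
    return (data[index // 4] >> shift) & 0b11


def covered_input_fields(flag):
    """List the input fields that the signature still covers."""
    fields = []
    if not flag & SINCE_BIT:
        fields.append("since")
    if not flag & OUTPOINT_BIT:
        fields.append("outpoint")
    return fields
```

With this layout, three inputs flagged 00, 11 and 10 fit in a single byte, and five inputs need two bytes.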

For outputs, there are four attributes to adjust: lock, type, capacity, and output_data. A set with four elements has 16 subsets, so 4 bits are enough to express the coverage range of each output, including the case where nothing is covered. Only the mapping needs to be fixed. For example, the first bit represents lock: when it is 0, the signature covers lock, and when it is 1, it does not. Similarly, the second bit represents type, the third bit represents capacity, and the fourth represents output_data. Under this rule, 0000 means a full signature, 0001 means output_data is not signed, 1000 means lock is not signed, 1010 means lock and capacity are not signed, and 1111 means nothing is covered, although an output placeholder is still required at the corresponding position. Therefore, fine-grained control needs only 4 bits per output. For 20 outputs, for example, 10 bytes of data are enough.
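A minimal sketch of the 4-bit output flag described above. The bit-to-field mapping follows the text (first bit is lock, fourth bit is output_data); the nibble packing order inside a byte and the function names are assumptions made for illustration.

```python
# Field order follows the post: first bit = lock ... fourth bit = output_data.
OUTPUT_FIELDS = ("lock", "type", "capacity", "output_data")


def covered_output_fields(flag):
    """Return the fields still covered by the signature for a 4-bit flag.

    A bit value of 0 means "covered", 1 means "not covered".
    """
    return [
        name
        for i, name in enumerate(OUTPUT_FIELDS)
        if not (flag >> (3 - i)) & 1
    ]


def output_flags_size(output_cells):
    """Number of bytes needed to hold 4 bits per output."""
    return (output_cells * 4 + 7) // 8


def output_flag(data, index):
    """Read the flag of output `index` (assumed: even index in the high nibble)."""
    byte = data[index // 2]
    return byte >> 4 if index % 2 == 0 else byte & 0x0F
```

For instance, `covered_output_fields(0b1010)` leaves only type and output_data covered, and `output_flags_size(20)` gives the 10 bytes mentioned in the text.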

We supplement the data structures in CKB Open Transaction (OTX) CoBuild Protocol Overview with the following changes:

// A full CKB Transaction can only contain one OtxStart structure, denoting the
// start of the Otxs in the CKB Transaction. The 4 fields included here mark
// the starting indices of input cells, output cells, cell deps and header deps
// that belong to an Otx. There is no need to mark the starting index for Otx
// witnesses, since OtxStart itself already decides the starting index for Otx
// related witnesses.
//
// Several Otx structures follow the OtxStart structure in the witnesses array,
// each denoting a different Otx packed in the current CKB Transaction.
//
// Depending on where OtxStart sits in the witnesses array, the OtxStart
// structure itself might not be covered by any signature in the CKB Transaction.
table OtxStart {
    start_input_cell: Uint32,
    start_output_cell: Uint32,
    start_cell_deps: Uint32,
    start_header_deps: Uint32,
}

table SealPair {
    script_hash: Byte32,
    seal: ByteVec,
}
vector Seals <SealPair>;

// Otx structure denotes an Otx included in the Transaction, each Otx has exactly
// one Otx structure in the witnesses array. The index of a particular Otx witness
// in the witnesses array, has nothing to do with the indices of input / output
// cells that belong to this particular Otx.
table Otx {
    // Assuming an Otx has 3 input cells, those 3 input cells might choose to use
    // different lock scripts, signed by different signature verification algorithms.
    // Following the above discussion, an Otx has a single Otx witness structure.
    // We need a way here to keep multiple signatures for different lock scripts
    // in a single witness.
    // This is why seals is designed in the way it is, each SealPair in Seals
    // resembles Action in the CoBuild message definition. We use script_hash
    // included in SealPair to distinguish among different lock scripts. A lock
    // script will first find a SealPair whose script_hash matches its own lock
    // script hash, then extract the seal field from the correct SealPair, which
    // might contain the actual signature.
    seals: Seals,
    // input_cells is an integer number representing the number of input cells
    // belonging to current Otx.
    // output_cells / cell_deps / header_deps function in a similar way.
    // input_flags is a ByteVec marking, for each input of the current Otx,
    // which attributes should be covered by the signature.
    // output_flags functions in a similar way.
    input_cells: Uint32,
    input_flags: ByteVec,
    output_cells: Uint32,
    output_flags: ByteVec,
    cell_deps: Uint32,
    header_deps: Uint32,
    message: Message,
}
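Because input_flags and output_flags are plain ByteVec fields, a verifier would probably want to check that their lengths agree with input_cells and output_cells before interpreting them. The sketch below assumes 2 bits per input and 4 bits per output as discussed earlier; the function names and the rule that lengths must match exactly are illustrative assumptions.

```python
def expected_flag_bytes(items, bits_per_item):
    """Bytes needed to store `bits_per_item` bits for each of `items` entries."""
    return (items * bits_per_item + 7) // 8


def flags_well_formed(input_cells, input_flags, output_cells, output_flags):
    """Check that both flag vectors have exactly the expected length.

    Assumption: 2 bits per input, 4 bits per output, no extra trailing bytes.
    """
    return (
        len(input_flags) == expected_flag_bytes(input_cells, 2)
        and len(output_flags) == expected_flag_bytes(output_cells, 4)
    )
```

An Otx with 3 inputs and 20 outputs would then carry 1 byte of input_flags and 10 bytes of output_flags.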

Addendum: for cell_deps and header_deps, we can use 1 bit per item for fine-grained control. When the bit is 0, the signature covers the item. When it is 1, the signature does not cover the item and only verifies that a placeholder exists at that position.
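To illustrate what "only verifies the placeholder" could mean, here is a rough sketch in which covered items are hashed in full while uncovered items contribute a single marker byte, so their position still counts. The marker encoding is invented for this example, and plain BLAKE2b from hashlib stands in for whatever hashing the real protocol would use.

```python
import hashlib


def hash_deps(deps, flags):
    """Hash a list of serialized deps under 1-bit-per-item coverage flags.

    deps:  list of bytes, one serialized cell dep or header dep per item
    flags: list of 0/1, where 0 = covered and 1 = placeholder only
    """
    if len(deps) != len(flags):
        raise ValueError("one flag is required per dep")
    hasher = hashlib.blake2b(digest_size=32)
    for dep, flag in zip(deps, flags):
        if flag == 0:
            # Covered: commit to the full content, length-prefixed.
            hasher.update(b"\x00" + len(dep).to_bytes(4, "little") + dep)
        else:
            # Not covered: commit only to the presence of an item.
            hasher.update(b"\x01")
    return hasher.digest()
```

Under this sketch, replacing an uncovered dep leaves the hash unchanged, while dropping it (and so removing the placeholder) or changing a covered dep produces a different hash.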



If we want to make the control of OTX even finer, we can split the lock and type of an output into code_hash+hash_type and args. Note that code_hash and hash_type are constrained together.

In this case, compared with the original post, there are six adjustable attributes: lock becomes lock_code and lock_args, and type becomes type_code and type_args. Under this design, six bits per output are enough to express such constraints.
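A sketch of the six-bit variant. Six bits do not align with byte boundaries, so this example treats the flags as one big-endian bit stream. The field order, the bit-stream layout and the function names are all assumptions for illustration.

```python
# Assumed order: first bit = lock_code ... sixth bit = output_data.
FIELDS6 = (
    "lock_code", "lock_args",
    "type_code", "type_args",
    "capacity", "output_data",
)


def flags6_size(output_cells):
    """Bytes needed for 6 bits per output, padded to a whole byte."""
    return (output_cells * 6 + 7) // 8


def read_flag6(data, index):
    """Read the 6-bit flag of output `index` from a big-endian bit stream."""
    total_bits = len(data) * 8
    shift = total_bits - 6 * (index + 1)
    if shift < 0:
        raise IndexError("flag index out of range")
    return (int.from_bytes(data, "big") >> shift) & 0b111111


def covered_fields6(flag):
    """Fields still covered by the signature (bit 0 = covered)."""
    return [
        name
        for i, name in enumerate(FIELDS6)
        if not (flag >> (5 - i)) & 1
    ]
```

With this layout, 20 outputs need 15 bytes instead of the 10 bytes required by the four-bit scheme.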



This is indeed a possible variation on the OTX design, but it certainly comes with its own implications:

  • What should really be the smallest granularity supported here? CellInput has OutPoint and since fields. While OutPoint might make sense as the smallest entity, could since be further divided into pieces? Do we want to support the case where only an epoch-based since value is used? Or does OTX need to be expressive enough to state that any value later than block 500,000 can be used? In addition, it is not enough to simply split a script into code_hash, hash_type, and args. Do we need to consider the scenario where any hash_type later than data1 is acceptable? Or does it make sense for an OTX to only cover the first 20 bytes of args? Or [4..8) in args? And let’s go one step further still: does it make sense for an OTX to only define the lock script for a particular input cell? There is a pretty deep rabbit hole waiting for us.
  • Another problem here is that the security assumptions might also change if an OTX only defines certain parts of a cell. Assuming an OTX only guards the type script & cell data, what if the OTX agent assembles the OTX using an always-success lock, or a lock that cannot be unlocked? What if someone tries to steal the CKBytes of an OTX that does not lock the capacity part?

Of course we could add all kinds of validation rules, supporting finer granularity, but that would pretty much lead us back to previously tried paths:

One big rationale behind the OTX Cobuild protocol is that we want to use a complete cell as the smallest entity in OTX apps. Yes, I’m fully aware of the implications, and of the fact that this might not be as flexible as previous designs, but it will be easier to reason about OTX apps now. Given a full cell, you know exactly what an OTX wants to achieve. The ambiguity is reduced to a minimum, and one does not need to think about the complications that arise when an OTX guards only part of a cell.

I do agree that in later designs, we might end up adding support for finer-grained guards like those discussed here. But personally I am not sure we want to be as expressive as we can in the first version of the OTX design. I do feel safer when we are only dealing with complete cells at the moment, at least until we know more about the implications of OTXs in a real-world setting.


The solution to this problem is not complicated. As long as we allow a certain attribute to be left blank, we can use an Action to impose arbitrarily fine constraints on it, for example by running a piece of code through spawn to check it. Although this spawn usage is not currently supported, we can pass the constraint to the specified type script for checking. Such code could check, for instance, that the hash_type of a lock is later than data1 or is type. In a similar way, we can implement arbitrarily fine constraints.
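As a rough illustration of the kind of check such code might perform, the sketch below accepts a hash_type that is type, or a data variant strictly later than a given one. The numeric values (data = 0, type = 1, data1 = 2, data2 = 4) are taken from my understanding of the CKB script hash_type encoding and should be verified against the RFCs; the function itself is hypothetical and written in Python only for readability, since a real check would live in an on-chain script.

```python
# Assumed CKB hash_type encoding; verify against the CKB RFCs before relying on it.
HASH_TYPE_DATA = 0
HASH_TYPE_TYPE = 1
HASH_TYPE_DATA1 = 2
HASH_TYPE_DATA2 = 4

DATA_VARIANTS = (HASH_TYPE_DATA, HASH_TYPE_DATA1, HASH_TYPE_DATA2)


def hash_type_allowed(hash_type, after=HASH_TYPE_DATA1):
    """Accept `type`, or any known data variant strictly later than `after`."""
    if hash_type == HASH_TYPE_TYPE:
        return True
    return hash_type in DATA_VARIANTS and hash_type > after
```

With the default argument this reads "later than data1 or type" literally, so data1 itself is rejected; passing `after=HASH_TYPE_DATA` would also admit data1.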

The solution to this problem is similar to the previous one. When an application needs to use more fine-grained control, then it needs additional logic to protect the unconstrained parts.

I don’t think all detection mechanisms should be implemented by the protocol. They can be left to the application itself, as long as users understand what they sign.

Analyzing the example you mentioned, the user can construct an output that does not contain lock. The application may have a checking mechanism, that is, whoever provides a special zero-knowledge proof can fill in the corresponding output lock as his or her own, which may be useful in some privacy scenarios.

I think what removing something from OTX signature coverage means is that it will now be more finely constrained by the application itself. This is what Action is for.

In fact, it was while thinking about application design that I encountered the Cell Output Reusing Attack problem mentioned in Exploring the CKB OTX Paradigm: Accomplishments and Insights from Building a Transaction Streaming Prototype, which prompted me to look for a finer-grained design.

In certain scenarios, the output value is not fixed but is constrained by an Action. Under the current design, the only way to exert control is to exclude the needed output from the signing message completely, which forces the application to scan all outputs to find one that meets the requirements. This leaves the possibility of a Cell Output Reusing Attack open.

Suppose the OTX’s signing message could leave some attributes of the output uncovered, or even leave the corresponding position blank. In that case, the application could at least determine which OTX the specific output belongs to and which Action it is tied to, thereby making Cell Output Reusing Attack no longer possible.
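The point about knowing which OTX an output belongs to can be sketched as a simple index lookup. This assumes that the outputs of consecutive OTXs are laid out contiguously from start_output_cell, in the order of their Otx witnesses; `otx_for_output` is a hypothetical helper, not an existing API.

```python
def otx_for_output(start_output_cell, output_counts, output_index):
    """Return the index of the OTX that owns `output_index`, or None.

    start_output_cell: OtxStart.start_output_cell
    output_counts:     Otx.output_cells of each OTX, in witness order
    """
    cursor = start_output_cell
    for otx_index, count in enumerate(output_counts):
        if cursor <= output_index < cursor + count:
            return otx_index
        cursor += count
    return None
```

If a partially covered or blank output still occupies a slot inside an OTX's range, an application can tie it to exactly one OTX this way instead of scanning every output in the transaction.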

Let me first confirm one thing: I believe a smaller granularity for OTX, where we guard a single field rather than a full cell, is not without its merits. But the past trials of OTX also taught us a valuable lesson: in CKB, we always aim at flexibility, but flexibility has its costs. When a model is too flexible, it can also be hard to make sure it is secure. I personally believe that you are underestimating the effort required to augment an Action to impose finer constraints. We can disagree here of course. What I’m trying to explain is that I want to start simple in the current OTX Cobuild protocol.

If later we understand this protocol better, and believe finer-grained guards provide enough advantages while being completely understood, we can definitely introduce those extra features to cobuild. But shall we do that now? I have my concerns.

In CKB, we now have 3 different levels of flexibility:

  1. A sighash-all signed transaction, where every piece of the transaction is signed. No modification is allowed.
  2. An OTX, where certain input / output cells are fully guarded. No modifications to those cells are allowed.
  3. An OTX with custom actions, which might guard certain input cells while allowing arbitrary output cells that satisfy the additional rules required by the actions.

We are now arguing if we need a level 2.5, where part of the output cells are guarded via OTX rules, but other part of the output cells are guarded by rules provided by actions. Personally, I would wait to see the answer to the following 2 questions:

  • Is it really true that level 2 does not provide enough flexibility for certain cases?
  • Is there any rule that is particularly hard to enforce via custom actions?

You mentioned that we can use an Action to impose finer constraints on output cells. I personally believe this is totally doable, and it should also be the answer when you only need constraints on part of the output cells.

Note that in previous discussions, I mentioned that I have concerns about augmenting an OTX with an Action for further constraints. Let me be clear: I have no problem with the OTX Cobuild protocol, where the OTX protocol guards full output cells. I also have no problem with using custom Actions to guard part of the output cells. But I do have a problem with mixing the OTX Cobuild protocol and custom Actions, where one field might be guarded by the OTX protocol while another field is validated by custom Actions. Personally I believe this to be a dangerous thing.

What’s more, if you stick strictly to custom Actions validated by a single lock / type script, Cell Output Reusing Attack won’t really be a problem here. The said lock / type script can already ensure different output cells contribute to different OTX.

If a specific Action executed by OTX A and an Action executed by OTX B have overlapping outputs, and they are two different Types, then this attack is still possible because the output is not bound to a specific OTX.

This is true for all pool-based applications, such as Lending/AMM.

To me it depends on how you define the Actions

This is of course a bit dangerous, but it is not a reason not to allow it, because using an all-zero flag means exactly the same as the original OTX mode, so this is an optional extension. Writing Turing-complete scripts is dangerous too.