A Fine-Grained Control Scheme Based on Cobuild-OTX

The current Cobuild-OTX can only choose which input_cells, output_cells, cell_deps, and header_deps are covered by the OTX hash. It cannot exercise finer control, such as excluding output_data from an output. This limitation constrains the flexibility of applications designed with OTX. Previously, Xuejie designed a more complex OTX scheme, RFC: Composable Open Transaction Lock Script, that could control the signature coverage range more finely, but it had security issues in some cases, so the plan was changed to the current solution.

However, based on the current Cobuild-OTX scheme, is it possible to make some minor modifications to support fine-grained control?

First, we need to analyze which parts can be controlled more finely. Cell_deps and header_deps do not need this because they only have a fixed single attribute.

For inputs and outputs, however, fine-grained control is possible.

For inputs, there are two attributes to adjust: since and outpoint. These can be expressed with 2 bits. With input_cells being the number of inputs, only input_cells * 2 bits are needed to control which parts of each input the signature covers. For example, the first bit represents since: when it is 0, the signature covers since, and when it is 1, it does not. Similarly, the second bit represents the outpoint. When the flag is set to 11, it effectively applies no constraints to the input. However, since the final signature verification requires unlocking the lock, it implicitly imposes a constraint on the lock.
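As a rough illustration of the 2-bit input flags, here is a minimal Python sketch. The packing order (input 0 in the most significant bits of the first byte) and the helper names flags_byte_len, get_flag and input_coverage are assumptions made for this example; the proposal itself does not fix them.

```python
def flags_byte_len(items: int, bits_per_item: int) -> int:
    """Number of bytes needed to hold `items` flags of `bits_per_item` bits."""
    return (items * bits_per_item + 7) // 8


def get_flag(flags: bytes, index: int, bits_per_item: int) -> int:
    """Read the flag of item `index`, assuming MSB-first packing (an assumption)."""
    value = 0
    for k in range(bits_per_item):
        bit_pos = index * bits_per_item + k
        bit = (flags[bit_pos // 8] >> (7 - bit_pos % 8)) & 1
        value = (value << 1) | bit
    return value


def input_coverage(flags: bytes, index: int) -> dict:
    """Map an input's 2-bit flag to the attributes the signature covers.

    First bit is since, second bit is outpoint; 0 means covered, 1 means not.
    """
    flag = get_flag(flags, index, 2)
    return {"since": flag & 0b10 == 0, "outpoint": flag & 0b01 == 0}


# Four inputs packed into one byte: 00, 01, 10, 11.
example_flags = bytes([0b00011011])
```

With this packing, four inputs fit in one byte, and input 3 (flag 11) has neither since nor outpoint covered.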

For outputs, there are four attributes to adjust: lock, type, capacity, and output_data. A set with four elements has 16 subsets (15 non-empty ones plus the empty set), so 4 bits per output are enough to express every coverage range. Once the mapping is fixed, the first bit represents lock: when it is 0, the signature covers lock, and when it is 1, it does not. Similarly, the second bit represents type, the third bit represents capacity, and the fourth represents output_data. According to this rule, 0000 means a full signature, 0001 means not signing output_data, 1000 means not signing lock, 1010 means not signing lock and capacity, and 1111 means covering nothing, although an output placeholder is still needed in the corresponding position. Therefore, only the number of outputs multiplied by 4 bits is needed to achieve fine-grained control. For example, 20 outputs need only 10 bytes of flag data.
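The mapping above can be sketched in Python as follows. The function names covered_fields and output_flags_len are hypothetical, and the flag is written as a bit string only to mirror the examples in the text.

```python
# Bit order taken from the text: lock, type, capacity, output_data.
OUTPUT_FIELDS = ("lock", "type", "capacity", "output_data")


def covered_fields(flag: str) -> list:
    """Return the output attributes covered by the signature for a 4-bit flag.

    A bit of "0" means the attribute is covered, "1" means it is not.
    """
    if len(flag) != len(OUTPUT_FIELDS) or set(flag) - {"0", "1"}:
        raise ValueError("flag must be a string of exactly four 0/1 characters")
    return [name for name, bit in zip(OUTPUT_FIELDS, flag) if bit == "0"]


def output_flags_len(output_cells: int) -> int:
    """Bytes needed to hold 4 bits per output, rounded up to whole bytes."""
    return (output_cells * 4 + 7) // 8
```

For instance, covered_fields("1010") leaves only type and output_data under the signature, and output_flags_len(20) gives the 10 bytes mentioned above.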

We supplement the data structures in CKB Open Transaction (OTX) CoBuild Protocol Overview with the following changes:

// A full CKB Transaction can only contain one OtxStart structure, denoting the
// start for Otxs in the CKB Transaction. The 4 fields included here mark
// the starting indices of input cells, output cells, cell deps and header deps
// that belong to an Otx. There is no need to mark the starting index for Otx
// witnesses, since OtxStart itself already decides the starting index for Otx
// related witnesses.
// Several Otx structures would follow the OtxStart structure in the witnesses
// array, each denoting a different Otx packed in the current CKB Transaction.
table OtxStart {
    start_input_cell: Uint32,
    start_output_cell: Uint32,
    start_cell_deps: Uint32,
    start_header_deps: Uint32,
}

table SealPair {
    script_hash: Byte32,
    seal: ByteVec,
}
vector Seals <SealPair>;

// Otx structure denotes an Otx included in the Transaction, each Otx has exactly
// one Otx structure in the witnesses array. The index of a particular Otx witness
// in the witnesses array, has nothing to do with the indices of input / output
// cells that belong to this particular Otx.
table Otx {
    // Assuming an Otx has 3 input cells, those 3 input cells might choose to use
    // different lock scripts, signed by different signature verification algorithms.
    // Following the above discussion, an Otx has a single Otx witness structure.
    // We need a way here to keep multiple signatures for different lock scripts
    // in a single witness.
    // This is why seals is designed in the way it is, each SealPair in Seals
    // resembles Action in the CoBuild message definition. We use script_hash
    // included in SealPair to distinguish among different lock scripts. A lock
    // script will first find a SealPair whose script_hash matches its own lock
    // script hash, then extract the seal field from the correct SealPair, which
    // might contain the actual signature.
    seals: Seals,
    // input_cells is an integer representing the number of input cells
    // belonging to the current Otx.
    // output_cells / cell_deps / header_deps function in a similar way.
    // input_flags is a ByteVec marking which attributes of each input cell
    // the signature covers (2 bits per input cell).
    // output_flags functions in a similar way (4 bits per output cell).
    input_cells: Uint32,
    input_flags: ByteVec,
    output_cells: Uint32,
    output_flags: ByteVec,
    cell_deps: Uint32,
    header_deps: Uint32,
    message: Message,
}

Supplementary note: for cell_deps and header_deps, we can use 1 bit per item for fine-grained control. When the bit is 0, the signature covers the item. When it is 1, the signature does not cover the item and only verifies the placeholder.


If we want to make the control of OTX finer, we can split the output lock and type into code_hash+hash_type, and args. Note that code_hash+hash_type is constrained together.

In this case, compared with the original article, there are six adjustable attributes: lock becomes lock_code and lock_args, and type becomes type_code and type_args. With this design, six bits per output are enough to express such constraints.
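A minimal Python sketch of the six-bit variant follows. The bit order (lock_code, lock_args, type_code, type_args, capacity, output_data, most significant bit first), the use of None as the placeholder, and the name masked_output are all assumptions for illustration; the post only states that six bits per output suffice.

```python
# Assumed bit order, most significant bit first.
FIELDS = ("lock_code", "lock_args", "type_code", "type_args", "capacity", "output_data")


def masked_output(output: dict, flag: int) -> dict:
    """Return the view of an output that the signature would cover.

    A set bit means the attribute is NOT covered; it is replaced by None,
    which stands in for the placeholder here.
    """
    masked = {}
    for position, name in enumerate(FIELDS):
        bit = (flag >> (len(FIELDS) - 1 - position)) & 1
        masked[name] = None if bit else output[name]
    return masked


# Illustrative output; the values are made up for this example.
example_output = {
    "lock_code": "lock code_hash+hash_type",
    "lock_args": b"\x01\x02",
    "type_code": "type code_hash+hash_type",
    "type_args": b"\x03",
    "capacity": 100,
    "output_data": b"",
}
```

Here masked_output(example_output, 0b010010) leaves lock_args and capacity open for the application to constrain, while a flag of 0 reproduces the fully covered output.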


This is indeed a possible variation on the OTX design, but it certainly comes with its own implications:

  • What shall really be the smallest granularity supported here? CellInput has OutPoint and since fields. While OutPoint might make sense as the smallest entity, could since be further divided into pieces? Do we want to support the case where only an epoch-based since value is used? Or does OTX need to be expressive enough to state that any value later than block 500,000 can be used? In addition, it is not enough to simply split a script into code_hash, hash_type, and args. Do we need to consider the scenario where any hash_type later than data1 is accepted? Or does it make sense for an OTX to only cover the first 20 bytes of args? Or [4..8) in args? And let’s go still one step further: does it make sense for an OTX to only define the lock script for a particular input cell? There is a pretty deep rabbit hole waiting for us.
  • Another problem here is that the security assumption might also change if an OTX only defines certain parts of a cell. Assuming an OTX only guards the type script & cell data, what if the OTX agent assembles the OTX using an always-success lock, or a lock that cannot be unlocked? What if someone tries to steal the CKBytes from an OTX that does not lock the capacity part?
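To make the question about splitting since concrete, here is a small Python sketch of how the field decomposes. The layout used (bit 63 as the relative flag, bits 62..61 as the metric, the low 56 bits as the value) is my reading of CKB's since encoding and should be treated as an assumption; decode_since is a hypothetical helper.

```python
# Assumed metric encoding in bits 62..61 of the 64-bit since field.
METRICS = {0b00: "block_number", 0b01: "epoch", 0b10: "timestamp"}


def decode_since(since: int) -> dict:
    """Split a 64-bit since value into its relative flag, metric and value."""
    relative = bool((since >> 63) & 1)
    metric_bits = (since >> 61) & 0b11
    if metric_bits not in METRICS:
        raise ValueError("unknown metric flag")
    value = since & ((1 << 56) - 1)
    return {"relative": relative, "metric": METRICS[metric_bits], "value": value}
```

Under this layout, an OTX that wanted to pin only "epoch-based" or "later than block 500,000" would have to constrain individual pieces of this structure rather than the whole field.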

Of course we could add all kinds of validation rules, supporting finer granularity, but that would pretty much lead us back to previously tried paths:

One big rationale behind the OTX Cobuild protocol is that we want to use a complete cell as the smallest entity in OTX apps. Yes, I’m fully aware of the implications, and of the fact that this might not be as flexible as previous designs, but it will be easier to reason about OTX apps now. Given a full cell, you will know exactly what an OTX wants to achieve. The ambiguity will be reduced to a minimum, and one does not need to think about the complications that arise when an OTX guards only part of a cell.

I do agree that in later designs, we might end up adding support for finer-grained guards like those discussed here. But personally I am not sure if we want to be as expressive as we can in the first version of the OTX design. I do feel safer when we are only dealing with complete cells at the moment, at least till we know more about the implications of OTXs in a real-world setting.

The solution to this problem is not complicated. As long as we allow a certain attribute to be left blank, we can use Action to impose arbitrarily fine constraints on it, such as executing a piece of code through spawn to check it. Although this spawn usage is not currently supported, we can pass this constraint to the specified type script for checking. For example, such code could check that the hash_type of a lock is greater than data_1 or is type. In a similar way, we can implement arbitrarily fine constraints.
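A check like the hash_type rule mentioned above might look like the following Python sketch. The numeric encoding (data = 0, type = 1, data1 = 2, data2 = 4) is my understanding of CKB's hash_type byte and is an assumption here, as is reading "greater than data_1" as "data1 or later"; hash_type_allowed is a hypothetical name, and a real check would live in an on-chain script rather than in Python.

```python
# Assumed hash_type encoding (not stated in this thread):
# data = 0, type = 1, data1 = 2, data2 = 4; data variants are even numbers.
HASH_TYPE_TYPE = 1
HASH_TYPE_DATA1 = 2


def hash_type_allowed(hash_type: int) -> bool:
    """Accept `type`, or any data variant at or after data1 (assumed reading)."""
    if hash_type == HASH_TYPE_TYPE:
        return True
    is_data_variant = hash_type % 2 == 0
    return is_data_variant and hash_type >= HASH_TYPE_DATA1
```

Under these assumptions the original data hash_type (0) is rejected, while type, data1 and data2 pass.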

The solution to this problem is similar to the previous one. When an application needs to use more fine-grained control, then it needs additional logic to protect the unconstrained parts.

I don’t think all detection mechanisms should be implemented by the protocol. Leave them to the application itself, as long as the user understands what they signed.

Analyzing the example you mentioned, the user can construct an output that does not contain lock. The application may have a checking mechanism, that is, whoever provides a special zero-knowledge proof can fill in the corresponding output lock as his or her own, which may be useful in some privacy scenarios.

I think what removing something from OTX signature coverage means is that it will now be more finely constrained by the application itself. This is what Action is for.

In fact, it was while thinking about the application design that I encountered the Cell Output Reusing Attack problem mentioned in Exploring the CKB OTX Paradigm: Accomplishments and Insights from Building a Transaction Streaming Prototype, which prompted me to seek a finer-grained design.

In certain scenarios, the output value is not fixed but is constrained by Action. Under the current design, I can only completely exclude the needed output from the signing message to exert control, forcing the application to scan all outputs to find one that meets the requirements. This leaves the possibility of a Cell Output Reusing Attack open.

Suppose the OTX’s signing message could leave some attributes of the output uncovered, or even leave the corresponding position blank. In that case, the application could at least determine which OTX the specific output belongs to and which Action it is tied to, thereby making Cell Output Reusing Attack no longer possible.
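The positional binding described here can be sketched in Python as follows. It assumes that the outputs of consecutive OTXs are laid out back to back starting at start_output_cell, with each OTX contributing output_cells entries; the helper names are hypothetical.

```python
def otx_output_ranges(start_output_cell: int, output_counts: list) -> list:
    """Compute the output index range owned by each OTX, in witness order."""
    ranges = []
    cursor = start_output_cell
    for count in output_counts:
        ranges.append(range(cursor, cursor + count))
        cursor += count
    return ranges


def owning_otx(output_index: int, ranges: list):
    """Return the position of the OTX whose range contains the output, or None."""
    for otx_position, output_range in enumerate(ranges):
        if output_index in output_range:
            return otx_position
    return None
```

Because an output left partially blank still occupies a slot inside its OTX's range, the application can look it up by index instead of scanning every output, so the same output cannot be counted for two OTXs.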

Let me first confirm one thing: I believe a smaller granularity of OTX, where we guard a single field rather than a full cell, is not without its merits. But the past trials of OTX also taught us a valuable lesson: in the quest of CKB, we always aim at flexibility. But the truth is that flexibility has its costs. When a model is too flexible, it might also be hard to make sure it is secure. I do personally believe that you are underestimating the effort of augmenting an Action to impose finer constraints. We could disagree here of course. What I’m trying to explain is that I do want to start simple in the current OTX Cobuild protocol.

If later we understand this protocol better, and believe finer-grained guards provide enough advantages while being completely understood, we can definitely introduce those extra features to cobuild. But shall we do that now? I have my concerns.

In CKB, we now have 3 different levels of flexibility:

  1. A sighash-all signed transaction, where every piece in the transaction is signed. No modification is allowed.
  2. An OTX, where certain input / output cells are fully guarded. No modifications to those cells are allowed.
  3. An OTX with custom actions, which might guard certain input cells while allowing arbitrary output cells that satisfy the additional rules required by the actions.

We are now arguing whether we need a level 2.5, where some parts of the output cells are guarded via OTX rules, while other parts are guarded by rules provided by actions. Personally, I would wait to see the answer to the following 2 questions:

  • Is it really true that level 2 does not provide enough flexibility for certain cases?
  • Is there any rule that is particularly hard to enforce via custom actions?

You mentioned that we can use Action to impose finer constraints on output cells. I personally believe this is totally doable, and it should also be the answer when you only need constraints on part of the output cells.

Note that in previous discussions, I mentioned that I have concerns about augmenting an OTX with an Action for further constraints. Let me be clear: I have no problems with the OTX Cobuild protocol, where the OTX protocol guards full output cells. I also have no problems using custom Actions to guard part of the output cells. But I do have a problem mixing the OTX Cobuild protocol and custom Actions, where one field might be guarded by the OTX protocol, but another field would be validated by custom Actions. Personally I believe this to be a dangerous thing.

What’s more, if you stick strictly to custom Actions validated by a single lock / type script, Cell Output Reusing Attack won’t really be a problem here. The said lock / type script can already ensure different output cells contribute to different OTXs.

If a specific Action executed by OTX A and an Action executed by OTX B have overlapping outputs, and they are two different Types, then this attack is still possible because the output is not bound to a specific OTX.

This is true for all pool-based applications, such as Lending/AMM.

To me it depends on how you define the Actions.

This is of course a bit dangerous, but it is not a reason not to allow it, because using an all-zero flag means exactly the same as the original OTX mode, so this is an optional extension. Writing Turing-complete scripts is dangerous too.