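Understanding the Programmability of CKB through Bitcoin Application Programming

Because new posts are limited in the number of hyperlinks they may contain, the references could not be linked at the exact places where they are used. Apologies for the inconvenience.

Abstract

Understanding the programmability of a system requires us to identify its structural features. Exploring application programming based on Bitcoin Script helps us understand the basic structure of the CKB Cell and its programming paradigm. Beyond that, it lets us decompose CKB's programming components into appropriate parts and understand the programmability gain each part brings.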

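1. Introduction

"Programmability" is a dimension people often use when comparing blockchain systems. Yet there is frequent disagreement about how to describe it. A common formulation is "blockchain XX supports a Turing-complete programming language" or "blockchain XX supports general-purpose programming", meant to signal that "blockchain XX" is highly programmable. There is some truth in this: systems that support Turing-complete programming are generally easier to program than those that do not. But a smart contract system has several structural features, and this statement touches only one of them, so it cannot give us a deep enough understanding: developers get no guidance from it, and ordinary users cannot use it to tell scams apart.

The structural features of a smart contract system include:

  • The basic form of state representation (contracts): accounts vs. transaction outputs
  • Whether arbitrary computation can be programmed (this is what "Turing-complete" refers to)
  • Whether execution can create new data or only returns a boolean (computation vs. verification)
  • Whether additional state can be recorded inside a contract
  • Whether a contract can access the state of another contract while it executes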

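So besides "can arbitrary computation be programmed", at least four other features affect the programmability of a smart contract system. One could even say these other features matter more, because at a deeper level they determine what is easy to implement and what is hard, what is economical and what is inefficient.

For example, Ethereum is often cited as a case of good programmability. But Ethereum's basic form of state representation is the account, which makes point-to-point contracts (payment channels, one-on-one betting contracts) hard to program: not strictly impossible, just a thankless effort. The Ethereum ecosystem has seen projects that tried to implement payment channels and state channels, and plenty of theoretical discussion, but today these projects no longer seem active, which clearly cannot be blamed on developers not trying hard enough. It is also no accident that the projects active on Ethereum today take the form of "pools" rather than "point-to-point contracts". Likewise, people may be satisfied with Ethereum's programmability at present, but for implementing "account abstraction" (which can also be understood as a generalization of the wallet concept), the account model can be said to be inherently deficient.

By the same token, exploring CKB's programmability requires understanding the structural features of CKB's smart contract system in these respects. What we already know is that CKB allows arbitrary computation to be programmed, allows additional state to be recorded in a contract, and allows a contract to access the state of another contract while executing. However, its contracts take the form of transaction outputs (called "Cells"), which makes it fundamentally different from Ethereum. Knowledge of Ethereum's smart contract system and of its contract instances therefore does not help us understand how CKB achieves these structural features, nor does it help us appreciate CKB's programmability.

Fortunately, smart contracts on Bitcoin seem to offer the best foundation for understanding CKB's programmability. This is not only because Bitcoin's basic form of state representation is also the transaction output (called a "UTXO"), but also because, with the help of a concept proposed in the Bitcoin community, "covenants", we can understand why CKB has the structural features above, split the final effect into a few parts, and identify the programmability gain each one brings.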

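2. CKB vs. BTC: What's More?

(I) Basic Structure

As Bitcoin's basic form of state representation, the UTXO ("unspent transaction output") has two fields:

  1. Amount, in satoshis, the bitcoin value held by the UTXO;
  2. scriptPubKey, also called the locking script, which states the conditions that must be met to spend the funds; in other words, the smart contract program that sets the unlocking conditions.

Compared with later smart contract systems, Bitcoin Script is quite restricted:

  • It does not allow arbitrary computation; the practically useful computations available for verification are few (signature checks, hash preimage checks, time checks).
  • It does not allow additional state to be recorded in the contract (for example, you cannot use a script to cap the proportion or amount of a single spend, nor can you hide a token inside it).
  • It does not allow access to another contract's state during execution (each script is its own universe, independent of the others).

Limited as it is, this kind of script is perfectly capable of producing impressive applications, and it happens to be the right foundation for exploring CKB's programmability. A later section is devoted to two examples of Bitcoin Script programming.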

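By contrast, CKB's unit of state is called a "Cell" and has four fields:

  1. Capacity, analogous to the UTXO's amount, which expresses how much space the Cell may occupy, in bytes;
  2. Lock Script, analogous to the UTXO's scriptPubKey, which defines ownership of the Cell; the Cell can be "updated" (in other words, released so that its Capacity can be used to mint new Cells) only when the data supplied passes the Lock Script;
  3. Data, arbitrary data whose size is bounded by Capacity;
  4. Type Script, an optional script that sets the conditions for updating Data.

In addition, the Lock Script and Type Script can be programmed with arbitrary computation. You can program any signature verification algorithm, a preimage check for any hash algorithm, and so on.

Readers will easily see how the Cell improves on the UTXO in programmability:

  • A Cell can be programmed with arbitrary computation rather than only a few specific ones, so its ownership verification is more flexible;
  • Thanks to the Data and Type Script fields, a Cell can record additional state, which allows Cells to carry so-called "UDTs" (user-defined tokens).

Combined with the Cell's "transaction output" structure, these two points alone already bring very large benefits. But from the description above we still do not know how a Cell achieves "a contract accessing the state of another contract at run time". For that we need a concept the Bitcoin community has discussed for a long time: "covenants".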

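(II) Covenants and Introspection

The original purpose of a covenant is to restrict where a sum of money can be spent. On today's Bitcoin (where no covenant proposal has been deployed), once funds are unlocked they can be spent anywhere (paid to any scriptPubKey). The idea of a covenant is that funds can somehow be restricted to certain destinations. For instance, if a UTXO can be spent only by one particular transaction, then even if someone can produce a signature for that UTXO, where it can go has already been fixed by that transaction. This may look like an odd feature, but it enables some interesting applications, which a later section covers. More importantly, it is the key to further understanding CKB's programmability.

Rusty Russell correctly pointed out that covenants can be understood as the ability to "introspect" a transaction: when UTXO A is spent by transaction B, the script interpreter can read part (or all) of transaction B and check whether it matches the parameters the script requires in advance, for example whether the scriptPubKey of transaction B's first output is the one required by UTXO A's scriptPubKey (this is the original meaning of a covenant).

Sharp-eyed readers will realize that with full introspection, one input of a transaction can read the state of another input of the same transaction, which gives exactly the ability described earlier as "a contract accessing the state of another contract at run time". In fact, this is how the CKB Cell is designed.

On this basis, full introspection can be divided into four cases:

  • A Lock Script reads other Lock Scripts (of inputs and outputs)
  • A Lock Script reads other Type Scripts and Data (of inputs and outputs)
  • A Type Script reads other Lock Scripts (of inputs and outputs)
  • A Type Script reads other Type Scripts and Data (of inputs and outputs)

This lets us analyze, under a certain assumption (the division of labor between Lock Script and Type Script), the role each kind of introspection plays in different application scenarios, that is, the programmability gain each part brings.

In the next two sections we look at Bitcoin Script programming as it stands today (before any covenant proposal is deployed) and at the functionality that covenant proposals promise, to understand concretely how CKB Cells can be programmed and how they can do better.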

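3. Bitcoin Script Programming

This section uses "Lightning channels / the Lightning Network" and "Discreet Log Contracts (DLC)" as case studies of application programming based on Bitcoin Script. Before diving in, we need two concepts.

(I) OP_IF and "Commitment Transactions"

The first concept is the flow-control opcodes of Bitcoin Script, such as OP_IF and OP_ELSE. These opcodes are no different from IF in ordinary programming: they execute different statements depending on the input. In the context of Bitcoin Script this means we can set up multiple unlocking paths for funds; combined with timelocks, it means we can assign priority among actions.

Take the famous "hashed timelock contract (HTLC)". In plain language the script says:

Either Bob reveals the preimage of some hash H and gives his signature, and can then spend the funds
or, after a period T has passed, Alice can spend the funds with her signature

This "either ... or ..." effect is achieved with flow-control opcodes.

The most prominent advantage of the HTLC is that it can bind several operations together and make them atomic. For example, suppose Alice wants to swap BTC for CKB with Bob. Bob first gives a hash and creates an HTLC on Nervos Network; Alice then creates an HTLC on Bitcoin using the same hash. Either Bob takes the BTC Alice paid on Bitcoin, revealing the preimage in the process and thereby allowing Alice to take the CKB on Nervos Network; or Bob does not reveal the preimage, both contracts expire, and Alice and Bob each recover the funds they put in.

Since the Taproot soft fork activated, this multi-path feature has been strengthened further by MAST (Merklized Abstract Syntax Trees): each unlocking path can become a leaf of a Merkle tree, and every leaf is independent, so flow-control opcodes of this kind are no longer needed. And because revealing one path does not expose the others, we can add many more unlocking paths to an output without worrying about cost.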

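The second concept is the "commitment transaction". The idea is that in some situations a valid Bitcoin transaction is real and binding even if it is never confirmed on the blockchain.

For example, Alice and Bob jointly own a UTXO that requires both of their signatures to spend. Alice constructs a transaction spending it that transfers 60% of the value to Bob and the rest to herself; she signs the transaction and sends it to Bob. For Bob, the payment is real and credible without the transaction being broadcast to the Bitcoin network or confirmed on the blockchain. Because Alice cannot spend this UTXO alone (and so cannot double-spend it), and because the signature she provided is valid, Bob can add his own signature at any time and broadcast the transaction to settle the payment. In other words, through this valid (off-chain) transaction, Alice has given Bob a "credible commitment".

Commitment transactions are the core concept of Bitcoin application programming. As noted above, Bitcoin contracts are verification-based, stateless, and unable to access one another. But if a contract carries no state, where is the state kept, and how does the contract advance safely (change state)? Commitment transactions give a direct answer: the contract's state can be expressed as transactions, so the participants can keep the state themselves without exposing it on the blockchain, and the problem of changing the contract's state turns into the problem of updating commitment transactions safely. Moreover, if entering a contract looks risky (for example, a contract that requires both parties' signatures to spend carries the risk that the other side stops responding and the funds get stuck), it is enough to generate a commitment transaction spending the contract and obtain its signatures in advance; this removes the risk and eliminates the need to trust the other participants.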

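(II) Lightning Channels and the Lightning Network

A Lightning channel is a one-to-one contract in which the two parties can pay each other an unlimited number of times without any single payment needing confirmation on the blockchain. As you would expect, it uses commitment transactions.

The section explaining "commitment transactions" already introduced one kind of payment channel. But a contract that uses only a 2-of-2 multisig can implement only one-way payments: either Alice keeps paying Bob or Bob keeps paying Alice, until the payer's balance in the contract is used up. With two-way payments, after some state update one party's balance may be lower than before, yet that party still holds the previous commitment transaction signed by the counterparty. How can they be prevented from broadcasting the old commitment transaction, so that they can broadcast only the latest one?

The Lightning channel's answer is called "LN-Penalty". Suppose Alice and Bob each have 5 BTC in a channel and Alice now wants to pay Bob 1 BTC. She signs the following commitment transaction and sends it to Bob: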

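Input #0, 10 BTC:
	Alice-Bob 2-of-2 multisig output (i.e., the channel contract)

Output #0, 4 BTC:
	Alice single signature

Output #1, 6 BTC:
	Either
		Alice-Bob joint temporary public key #1, single signature
	or
		Timelock T1, Bob single signature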

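Bob likewise signs a commitment transaction (the mirror image of the one above) and sends it to Alice:

Input #0, 10 BTC:
	Alice-Bob 2-of-2 multisig output (i.e., the channel contract)

Output #0, 6 BTC:
	Bob single signature

Output #1, 4 BTC:
	Either
		Bob-Alice joint temporary public key #1, single signature
	or
		Timelock T1, Alice single signature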

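The trick lies in the "joint temporary public key", which is generated from one public key of one's own and one public key supplied by the counterparty. For example, the Alice-Bob joint temporary public key is obtained by Alice taking one of her own public keys and a public key supplied by Bob, multiplying each by a hash value, and adding the results. When such a key is generated, nobody knows its private key. But if Bob tells Alice the private key of the public key he supplied, Alice can compute the private key of the joint temporary public key. This is the key to how a Lightning channel can "revoke" an old state.

The next time the two parties update the channel state (make a payment), they exchange the private keys of the temporary public keys they gave each other in the previous round. From then on, a participant no longer dares to broadcast the previous commitment transaction they hold: the output assigning value to them has two paths, and the private key of the temporary-public-key path is now known to the counterparty, so as soon as the old commitment transaction is broadcast, the counterparty can use the joint temporary private key to take all the funds in that output. This is what "LN-Penalty" means.

Concretely, the order of interaction is: the paying party first asks the counterparty for a new temporary public key, then constructs a new commitment transaction and hands it over; the party that received the commitment transaction then reveals the private key of the temporary public key it gave out in the previous round. This ordering guarantees that a participant always obtains the new commitment transaction before invalidating the one received in the previous round, so it is trustless.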

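To sum up, the key design elements of a Lightning channel are:

  1. Both parties always use commitment transactions to express the contract's internal state, and represent payments as changes in amounts;
  2. Commitment transactions always spend the same input (one requiring signatures from both parties), so all commitment transactions compete with each other and only one can ever be confirmed on the blockchain;
  3. The two participants do not sign the same commitment transaction (although the transactions come in pairs); what each signs is always the transaction more favorable to themselves. In other words, the commitment transaction a participant receives is always unfavorable to that participant;
  4. This disadvantage lies in the fact that the output assigning value to oneself has two unlocking paths: one can be unlocked with one's own signature but only after a waiting period; the other involves the counterparty's public key and is protected only as long as one's own temporary private key remains undisclosed;
  5. In each payment, the parties trade a new commitment transaction for the temporary private key the counterparty used in the previous round. The party that has handed over its temporary private key no longer dares to broadcast the old commitment transaction, which therefore stands "revoked", and the contract state is updated (in reality all these commitment transactions remain valid and could be broadcast; participants simply do not dare to, for fear of the penalty);
  6. Either party can exit the contract at any time with a commitment transaction signed by the counterparty; but if both are willing to cooperate, they can sign a new transaction that lets both take back their money immediately.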

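Finally, because HTLCs can also be placed in commitment transactions, Lightning channels can forward payments. If Alice can find a path of channels connected end to end that reaches Daniel, she can make a trustless multi-hop payment without opening a channel with Daniel. This is the Lightning Network:

Alice -- HTLC --> Bob -- HTLC --> Carol -- HTLC --> Daniel

Alice <-- preimage -- Bob <-- preimage -- Carol <-- preimage -- Daniel

When Alice has found such a path and wants to pay Daniel, she asks Daniel for a hash, uses it to construct an HTLC for Bob, and instructs Bob to forward a message to Carol and offer her the same HTLC; the message in turn instructs Carol to forward a message to Daniel and offer him the same HTLC. When the message reaches Daniel, he reveals the preimage to Carol, collecting the HTLC's value and updating the contract state. Carol does likewise, collecting Bob's payment and updating the channel state. Finally, Bob reveals the preimage to Alice and updates the state. Owing to the properties of the HTLC, this chain of payments either succeeds together or fails together, so it is trustless.

The Lightning Network is made up of one channel after another, and each channel (contract) is independent. This means Alice only needs to know what happens inside her channel with Bob; she need not care how many interactions take place in other people's channels, which currency those interactions use, or even whether they really use a channel at all.

The scalability of the Lightning Network shows not only in the fact that payment speed inside a channel is limited only by the hardware resources the two parties invest, but also in the fact that, because state is stored in a distributed way, a single node can get the greatest leverage at the lowest cost.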

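(III) Discreet Log Contracts

Discreet Log Contracts (DLC) use a cryptographic technique called "adaptor signatures" to let Bitcoin Script program financial contracts that depend on external events.

An adaptor signature is a signature that becomes valid only after a private key is added to it. Take Schnorr signatures as an example. The standard form of a Schnorr signature is (R, s), where:

R = r.G
# the nonce r used for the signature, multiplied by the elliptic curve generator point; one could also call it the public key of r

s = r + Hash(R || m || P) * p
# p is the signing private key and P the public key


Verifying the signature means verifying that s.G = r.G + Hash(R || m || P) * p.G = R + Hash(R || m || P) * P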

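Suppose I give out a pair of values (R, s'), where:

R = R1 + R2 = r1.G + r2.G

s' = r1 + Hash(R || m || P) * p

Clearly this is not a valid Schnorr signature and it fails the verification equation, but I can prove to the verifier that it becomes a valid signature as soon as they learn r2, the private key of R2:

s'.G + R2 = R1 + Hash(R || m || P) * P + R2 = R + Hash(R || m || P) * P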
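The algebra above can be checked numerically. The sketch below is only an illustration: instead of an elliptic curve it uses a toy additive group (integers modulo a prime, where "k.G" is just `k * G mod Q`), which keeps the linear relations but offers no security at all. The names `point`, `H` and `verify`, and all key values, are made up for this example.

```python
import hashlib

Q = 2**127 - 1   # prime modulus of the toy group (NOT a real curve, insecure)
G = 7            # toy "generator"


def point(k: int) -> int:
    """Toy stand-in for scalar multiplication k.G."""
    return (k * G) % Q


def H(*parts) -> int:
    """Hash(R || m || P) reduced into the scalar range."""
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def verify(R: int, s: int, m: str, P: int) -> bool:
    """Schnorr check: s.G == R + Hash(R || m || P) * P."""
    return point(s) == (R + H(R, m, P) * P) % Q


p = 123456789                 # signing private key
P = point(p)                  # public key
r1, r2 = 1111, 2222           # r2 is the secret the signature is "locked" to
R1, R2 = point(r1), point(r2)
R = (R1 + R2) % Q
m = "commitment tx"

e = H(R, m, P)
s_adaptor = (r1 + e * p) % Q  # s' = r1 + Hash(R || m || P) * p

# s' alone is not a valid signature for (R, m, P) ...
plain_valid = verify(R, s_adaptor, m, P)
# ... but the verifier can check s'.G + R2 == R + Hash(R || m || P) * P
adaptor_valid = (point(s_adaptor) + R2) % Q == (R + e * P) % Q
# Once r2 is known, s = s' + r2 completes the signature.
s = (s_adaptor + r2) % Q
completed_valid = verify(R, s, m, P)
```

The design point is that `adaptor_valid` can be checked with public values only, while completing the signature requires the secret `r2`.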

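Adaptor signatures make a signature's validity depend on a secret value, in a verifiable way. But what does this have to do with financial contracts?

Suppose Alice and Bob want to bet on the result of a football match. Alice bets that the Green Devils (绿魔) win and Bob bets that Alina (阿林纳) wins, with a stake of 1 BTC. A football commentary site, Carol, promises that when the result is known she will publish a signature s_c_i on the result using a nonce R_c.

Three outcomes are possible (so Carol's signature has three possible values):

- The Green Devils win, and Alice wins 1 BTC
- Alina wins, and Bob wins 1 BTC
- A draw, and both parties' funds return the way they came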

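To this end, the two create one commitment transaction for each outcome. For example, the commitment transaction they create for the first outcome looks like this:

Input #0, 2 BTC:
	Alice-Bob 2-of-2 multisig output (i.e., the betting contract)

Output #0, 2 BTC:
	Alice single signature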

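However, the signatures Alice and Bob create for this transaction are not (R, s) but adaptor signatures (R, s'). That is, the signatures they hand each other cannot be used directly to unlock the contract; a secret value must be revealed first. This secret value is precisely the preimage of s_c_1.G, namely Carol's signature! Because the nonce of Carol's signature is already fixed (it is R_c), s_c_1.G can be constructed in advance (s_c_1.G = R_c + Hash(R_c || 'Green Devils win' || PK_c) * PK_c, where PK_c is Carol's public key).

When the result is revealed, suppose the Green Devils win. Carol then publishes the signature (R_c, s_c_1), and either Alice or Bob can complete the counterparty's adaptor signature, add their own signature, turn the transaction above into a valid transaction, and broadcast it to the network to trigger settlement. If the Green Devils do not win, Carol will not publish s_c_1, and this commitment transaction can never become valid.

The other two transactions work the same way. In this manner Alice and Bob make the execution of the contract depend on an external event (more precisely, on an oracle's report of the external event, in the form of a signature) without having to trust each other. Financial contracts large and small, such as futures and options, can be implemented this way.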
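The step "s_c_1.G can be constructed in advance" can be sketched as follows. As an assumption for illustration only, the code uses a toy additive group (integers modulo a prime) instead of an elliptic curve, so it shows the arithmetic but is not secure; the outcome strings, key values and the helper names `anticipated_point` and `oracle_sign` are hypothetical.

```python
import hashlib

Q = 2**127 - 1   # prime modulus of the toy group (NOT a real curve, insecure)
G = 7            # toy "generator"


def point(k: int) -> int:
    """Toy stand-in for scalar multiplication k.G."""
    return (k * G) % Q


def H(*parts) -> int:
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


# Carol (the oracle) announces her public key and the nonce point R_c up front.
c, r_c = 424242, 987654321          # Carol's secrets, never shared
PK_c, R_c = point(c), point(r_c)

OUTCOMES = ["Green Devils win", "Alina wins", "draw"]


def anticipated_point(outcome: str) -> int:
    """s_c_i.G = R_c + Hash(R_c || outcome || PK_c) * PK_c, from public data only."""
    return (R_c + H(R_c, outcome, PK_c) * PK_c) % Q


def oracle_sign(outcome: str) -> int:
    """What Carol publishes once the result is known."""
    return (r_c + H(R_c, outcome, PK_c) * c) % Q


# Alice and Bob lock one commitment transaction to each anticipated point.
locks = {outcome: anticipated_point(outcome) for outcome in OUTCOMES}

# Result day: Carol signs exactly one outcome.
s_c = oracle_sign("Green Devils win")
unlocked = [outcome for outcome, pt in locks.items() if point(s_c) == pt]
```

Only the commitment transaction whose lock matches Carol's published signature can be completed; the other two stay unusable.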

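Compared with other implementations, the most distinctive feature of a Discreet Log Contract is its privacy: (1) Alice and Bob do not need to tell Carol that they are using her data, and this does not affect execution of the contract at all; (2) on-chain observers (Carol included) cannot tell from the transaction that executes Alice and Bob's contract which site's service they are using, and cannot even conclude that the contract is a bet (rather than a Lightning channel).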

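4. A Brief Introduction to Covenant Applications

(I) OP_CTV and Congestion Control

Bitcoin developers have put forward a number of proposals that can be classed as covenants. At present the best known is OP_CHECKTEMPLATEVERIFY (OP_CTV). Its concept is fairly simple yet retains a good deal of flexibility, which has made it popular in a Bitcoin community that prizes simplicity. The idea of OP_CTV is to commit to a hash in the script so that the funds can be spent only by the transaction that hash represents. The hash commits to the transaction's outputs and most of its fields, but not to its inputs, only to the number of inputs.

"Congestion control" is a good example of what OP_CTV can do. The basic scenario is helping a large number of users exit from an exchange (an environment that requires trust) into a pool. Because the pool uses OP_CTV to plan how it will be spent in the future, users are guaranteed a trustless exit from the pool without anyone's help. And because the pool appears as a single UTXO, it avoids paying large fees when demand for block space is high (n outputs are reduced to 1, and n transactions to 1). Users in the pool can then wait for a good moment to exit.

Suppose Alice, Bob and Carol want to withdraw 5 BTC, 3 BTC and 2 BTC from the exchange. The exchange can create a 10 BTC output with three OP_CTV branches. If Alice wants to withdraw, she uses branch 1. The transaction represented by that branch's OP_CTV hash creates two outputs: one gives Alice 5 BTC; the other is again a pool that uses OP_CTV to commit to a transaction, which only allows Bob to take out 3 BTC and sends the remaining 2 BTC to Carol.

The same holds when Bob or Carol wants to withdraw. When withdrawing they can use only a transaction that passes the corresponding OP_CTV check, so they can pay themselves only the corresponding amount and cannot withdraw arbitrarily; the remaining funds again go into a pool locked with OP_CTV, which guarantees that whatever order users withdraw in, the remaining users can still exit the pool trustlessly.

In the abstract, OP_CTV here plans the paths along which the contract moves toward the end of its life, so that whichever path the pool contract takes and whichever state it reaches, it keeps the property of trustless exit.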
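The three-user pool can be sketched as a small recursive construction. This is a simplification under stated assumptions: the "template hash" here is just SHA-256 over a JSON list of outputs, whereas the real OP_CTV hash covers more transaction fields; the function names and the output encoding are hypothetical.

```python
import hashlib
import json


def template_hash(outputs) -> str:
    """Simplified stand-in for the OP_CTV hash: commits to the outputs only."""
    return hashlib.sha256(json.dumps(outputs, sort_keys=True).encode()).hexdigest()


def withdrawal_outputs(user, amount, rest):
    """Outputs of the transaction by which `user` leaves a pool holding `rest`."""
    outputs = [{"to": user, "amount": amount}]
    if len(rest) == 1:
        # Last remaining user is simply paid out.
        (other, other_amount), = rest.items()
        outputs.append({"to": other, "amount": other_amount})
    elif rest:
        # Everyone else moves into a smaller pool with its own CTV branches.
        outputs.append({"to_pool": pool_branches(rest), "amount": sum(rest.values())})
    return outputs


def pool_branches(balances):
    """One committed template hash per user who might withdraw next."""
    branches = []
    for user, amount in sorted(balances.items()):
        rest = {u: a for u, a in balances.items() if u != user}
        branches.append(template_hash(withdrawal_outputs(user, amount, rest)))
    return branches


def ctv_allows(branches, outputs) -> bool:
    """A spend passes only if its outputs hash to one of the committed branches."""
    return template_hash(outputs) in branches


balances = {"Alice": 5, "Bob": 3, "Carol": 2}   # amounts in BTC
pool = pool_branches(balances)

alice_exit = withdrawal_outputs("Alice", 5, {"Bob": 3, "Carol": 2})
honest_ok = ctv_allows(pool, alice_exit)
theft_ok = ctv_allows(pool, [{"to": "Alice", "amount": 10}])
```

Because every branch commits to where the remainder goes, each user can exit in any order, and nobody can take more than their share.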

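OP_CTV also has a very interesting use: the "latent one-way payment channel". Suppose Alice creates such a pool and guarantees that funds can exit trustlessly into an output carrying the following script:

Either Alice and Bob spend it together
or, after a period of time, Alice can spend it alone

If Alice does not reveal it to Bob, Bob will not know such an output exists. Once Alice does reveal it, Bob can treat the output as a time-limited one-way payment channel: Alice can pay Bob from its funds immediately, without waiting for blockchain confirmation. Bob only needs to get the commitment transaction Alice gave him on chain before Alice is able to spend the output alone.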

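(II) OP_VAULT and Vaults

OP_VAULT is a covenant proposal designed specifically for building "vault" contracts.

A vault contract aims to be a safer, more advanced form of self-custody. Today's multisig contracts remove the single point of failure of one private key, but if an attacker really does obtain a threshold number of keys, the wallet owner is helpless. A vault seeks to impose a limit on how much can be spent at once; in addition, a withdrawal through the regular path is forced through a waiting period, during which it can be interrupted by an emergency wallet-recovery operation. Even if such a contract is compromised, the owner can fight back (using the emergency recovery branch).

In theory OP_CTV can also be used to program such a contract, but with many inconveniences. One is fees: in committing to a transaction, it also commits to the fee that transaction pays. Given what the contract is for, the interval between setting it up and withdrawing is bound to be long, so predicting an appropriate fee is almost impossible. Although OP_CTV does not restrict the inputs, so fees could be raised by adding inputs, everything in the added inputs would become fee, which is impractical. Another way is CPFP, paying the fee in a new transaction that spends the withdrawn funds. Moreover, using OP_CTV means such a vault cannot batch withdrawals (nor, of course, batch recoveries).

The OP_VAULT proposal tries to solve these problems with new opcodes (OP_VAULT and OP_UNVAULT). OP_UNVAULT is designed for batch recovery and we set it aside for now. OP_VAULT works like this: placed on a branch of the script tree, it can commit to an opcode that may be used (for example OP_CTV) without fixing its parameters; when the branch is spent, the transaction can pass in the concrete parameters but cannot change the other branches. As a result, there is no need to fix the fee in advance; it can be set when the branch is spent. If the branch also carries a timelock, the timelock is enforced. Finally, because only the branch it sits on can change, the other branches of the new script tree (including the emergency recovery branch) stay the same, which is what lets us interrupt such a withdrawal.

Two more points are worth mentioning. (1) The behavior of the OP_VAULT opcode resembles another covenant proposal, OP_TLUV. Jeremy Rubin correctly pointed out that this already introduces, to some degree, the notion of "computation": OP_TLUV/OP_VAULT first commits to an opcode so that a user can pass parameters to it in a new transaction and thereby update the leaves of the whole script tree. This is no longer "verifying the supplied data against given conditions" but "producing new, meaningful data from the supplied data", although the computation it enables is fairly limited.

(2) The full OP_VAULT proposal also relies on some proposed mempool policies (such as v3 transactions) to work better. This is a reminder that "programming" can mean more than we imagine. (A similar example is Open Transaction on Nervos Network.)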

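5. Understanding CKB

The two sections above showed how interesting applications can be programmed with scripts on a more restricted structure (the Bitcoin UTXO), and introduced proposals that try to add introspection to that structure.

Although the UTXO is capable of these applications, readers will readily notice their shortcomings, or room for optimization. For example:

  • In LN-Penalty, channel participants must keep every past commitment transaction together with the corresponding penalty secret in order to answer fraud by the counterparty, which is a storage burden. A mechanism ensuring that only the latest commitment transaction can take effect, and older ones cannot, would remove this burden, and would also remove the problem of a node being punished by accident after mistakenly putting an older commitment transaction on chain because of a fault.
  • In DLC, if the event has many possible outcomes, the two parties must generate and hand over many signatures in advance, which is also a heavy burden. Moreover, the payout of a DLC is bound directly to public keys, so positions are hard to transfer. Is there a way to transfer a position in the contract?

In fact, the Bitcoin community has proposed answers to these questions, mostly related to a sighash proposal (BIP-118, AnyPrevOut).

If we program on CKB, however, BIP-118 is in effect available today (this sighash flag can be emulated with introspection plus the ability to verify a signature over selected parts of the transaction).

By studying Bitcoin programming we learn not only how to program with the "transaction output" format (what CKB can program), but also how these applications can be improved (how CKB's capabilities can improve them if we build them on CKB). For CKB developers, programming based on Bitcoin Script can serve as a textbook, even a shortcut.

Below we analyze the programmability of each module of CKB programming in turn, leaving introspection aside at first.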

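(I) Lock Script with Arbitrary Computation

As noted above, a UTXO cannot be programmed with arbitrary computation, whereas a Lock Script can. This means a Lock Script can program everything that can be built on UTXOs (before covenants are deployed), including but not limited to the Lightning channels and DLCs described above.

Beyond that, the ability to verify arbitrary computation gives the Lock Script more, and more flexible, means of authentication than the UTXO. For example, on CKB we can implement a Lightning channel in which one party signs with ECDSA and the other with RSA.

In fact this is one of the first areas people explored on CKB: using this flexible authentication in users' self-custody to achieve so-called "account abstraction", where both authorizing a transaction and recovering control are highly flexible, with almost no restrictions. In principle this is the combination of "multiple spending branches" and "arbitrary means of authentication". Implementations include the JoyID wallet and UniPass.

In addition, a Lock Script can implement the eltoo proposal, giving Lightning channels that need to keep only the latest commitment transaction (in fact, eltoo can simplify all point-to-point contracts).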

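(II) Type Script with Arbitrary Computation

As noted above, a major use of the Type Script is programming UDTs. Combined with the Lock Script, this means we can implement Lightning channels (and other kinds of contracts) denominated in UDTs.

In fact, the separation of Lock Script and Type Script can be seen as a security upgrade: the Lock Script focuses on custody methods or contract-style protocols, while the Type Script focuses on defining the UDT.

Furthermore, the ability to trigger checks based on the UDT's definition lets UDTs take part in contracts in much the same way CKB itself does (UDT is a first-class citizen).

An example: I once proposed a protocol for trustless NFT-collateralized lending on Bitcoin. The key to the protocol is a commitment transaction whose input value is less than its output value (so it is not yet a valid transaction); once enough input is supplied, it becomes valid. As long as the borrower can repay, the lender cannot take the pledged NFT for themselves. But the trustlessness of this commitment transaction rests on the transaction's check of input and output amounts, so the borrower can repay only in bitcoin. Even if borrower and lender are both willing to accept another currency (say USDT issued with the RGB protocol), a Bitcoin commitment transaction cannot guarantee that the borrower gets the NFT back upon returning enough USDT, because a Bitcoin transaction knows nothing about the state of USDT! (Revision: in other words, a commitment transaction conditioned on repayment in USDT cannot be constructed.)

If checks can be triggered according to the UDT's definition, the lender can sign another commitment transaction that lets the borrower repay in USDT. The transaction will check the amount of USDT in the inputs and in the outputs, making repayment in USDT trustless for the user.

Revision: if the NFT used as collateral and the token used for repayment are issued under the same protocol (for example RGB), the problem can be solved: following the RGB protocol we can construct a commitment transaction so that the NFT's state transition and the repayment happen together (binding the two state transitions with a transaction inside RGB). However, because RGB transactions also depend on Bitcoin transactions, constructing this commitment transaction is somewhat difficult. In short, the problem can be solved, but the token cannot be made a first-class citizen.

Next we bring introspection into the picture.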

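(III) Lock Script Reading Other Lock Scripts

This covers the full range of programming possibilities on Bitcoin UTXOs once covenant proposals are implemented, including the vault contract mentioned above and applications based on OP_CTV (such as congestion control).

XueJie once gave a very interesting example: on CKB you can implement a receiving-account Cell. When such a Cell is used as a transaction input, if the Cell it outputs (a Cell with the same Lock Script) has more Capacity, the input need not provide a signature and the transaction is still valid. Without introspection, such a Cell could not be implemented. This kind of receiving-account Cell suits institutions as a way to receive payments because it gathers funds together; its drawback is poor privacy.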
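The rule of the receiving-account Cell can be written down as a small predicate. This is a sketch in plain Python, not a CKB script: the `Cell` class, the string used to stand for a Lock Script, and the `signed_by_owner` flag are simplifications introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cell:
    capacity: int
    lock: str   # stand-in for the Cell's Lock Script


def receiving_lock_passes(spent: Cell, outputs: List[Cell], signed_by_owner: bool) -> bool:
    """Lock Script logic of a receiving-account Cell.

    The owner can always spend with a signature. Anyone else may consume the
    Cell only if some output keeps the same lock and holds MORE capacity,
    which the script can check only by reading (introspecting) the outputs.
    """
    if signed_by_owner:
        return True
    return any(o.lock == spent.lock and o.capacity > spent.capacity for o in outputs)


account = Cell(capacity=1000, lock="receiving-lock(org)")

# A payer tops the account up without the owner's signature.
deposit_ok = receiving_lock_passes(account, [Cell(1500, "receiving-lock(org)")], False)
# An attacker tries to move the funds to a different lock.
theft_ok = receiving_lock_passes(account, [Cell(1000, "attacker-lock")], False)
```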

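(IV) Lock Script Reading Other Type Scripts (and Data)

An interesting application of this ability is equity tokens. The Lock Script decides, based on the amount of tokens in other inputs, whether its own Capacity can be used and where that Capacity can be spent (which also requires the ability to introspect Lock Scripts).

(V) Type Script Reading Other Lock Scripts

Uncertain, but we can assume it is useful. For example, a Type Script could check that the Lock Scripts of a transaction's inputs and outputs stay unchanged.

(VI) Type Script Reading Other Type Scripts (and Data)

Trading cards? Collect n tokens and exchange them for a bigger token :)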

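6. Conclusion

Compared with earlier smart contract systems that can program arbitrary computation (such as Ethereum), Nervos Network adopts a different structure, so knowledge of those earlier systems is often a poor basis for understanding it. Starting from application programming on a structure more restricted than the CKB Cell, the BTC UTXO, this article proposes a way to understand the programmability of the CKB Cell. And by using the concept of "introspection" to understand the Cell's "cross-contract access", we can distinguish the cases in which introspection is used and assign concrete uses to them.

Revision:

  1. Leaving aside the Cell's cross-access (introspection) ability, lock scripts can be seen as Bitcoin Script with state and with programming power taken close to its limit, so on this basis alone they can program every application based on Bitcoin Script;
  2. Leaving aside the Cell's cross-access (introspection) ability, the distinction between lock scripts and type scripts can be seen as a security upgrade: it separates a UDT's asset definition from its custody method. In addition, type scripts (and Data), whose state can be exposed, make UDTs first-class citizens.

These two points amount to something in the same paradigm as "BTC + RGB" but with stronger programming power;

  3. Taking the Cell's introspection into account, a Cell gains stronger programming power than a post-covenants BTC UTXO and can implement things that are hard for BTC + RGB (because BTC cannot read RGB's state).

This article cannot offer many concrete examples of these uses, but that is because of the author's limited knowledge of the CKB ecosystem. Given time, people will surely invest more and more imagination here and put together applications that are hard to imagine today.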

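Acknowledgements

Thanks to Retric, Jan Xie and Xue Jie for their feedback while this article was being written. All errors in it are, of course, my own responsibility.

References

https://xuejie.space/2019_07_05_introduction_to_ckb_script_programming_validation_model/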


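Dropping words like "general-purpose" and "Turing-complete" and discussing the strengths and weaknesses of a programmable system concretely: this article is a very good example of that.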


A deep and excellent piece. Thanks to @Ajian for sharing.


It’s indeed a very valuable article. Perhaps we should translate it into English? Is anyone willing to help me with this?


Hey @Ajian, I gave translating your piece into English a shot. Mind having a look and letting me know if it captures the essence of your original work? Your feedback would be awesome. Thanks!

Understanding the Programmability of CKB through Bitcoin Application Programming

Abstract

Understanding the programmability of a system requires us to identify the structural features of that system. Exploring application programming based on Bitcoin Script helps us understand the fundamental structure of the CKB Cell and its programming paradigm. Furthermore, it allows us to decompose the programming components of CKB into appropriate parts and helps us understand the programmability benefits brought by each part.

1. Introduction

“Programmability” is a dimension commonly used when comparing blockchain systems. However, there is often a lack of consensus on how programmability should be described. A common expression is “XX blockchain supports Turing-complete programming languages” or “XX blockchain supports general-purpose programming,” suggesting that the “XX blockchain” has great programmability. These statements have some truth: systems that support Turing-complete programming are typically easier to program than those that do not. However, the structural characteristics of smart contract systems have multiple aspects, and this statement addresses only one of them. On its own, therefore, it cannot give us a deep understanding of programmability: it offers developers no guidance, and it does not help average users tell scams apart.

The structural features of smart contract systems include:

  • The basic form of state expression (accounts vs. transaction outputs)
  • Whether arbitrary computations can be programmed (“Turing completeness” refers to this aspect)
  • Whether the execution process creates new meaningful data or only outputs boolean values (computation vs. verification)
  • Whether additional state can be recorded within the contract
  • Whether a contract can access the state of another contract during execution

So, apart from the question of “whether arbitrary computations can be programmed”, there are at least four other aspects that affect the programmability of a smart contract system. It can even be argued that these other aspects are more important because they, at a deeper level, determine what is easy to achieve, what is difficult to achieve, what is an economical implementation, and what is an inefficient implementation.

For example, Ethereum is usually seen as a system with great programmability. However, Ethereum’s basic form of state representation is the account, which makes it challenging to program point-to-point contracts (e.g., payment channels, one-on-one betting contracts) on Ethereum.

While it’s not impossible to implement these on Ethereum, it is a cumbersome process. There have been multiple attempts to implement payment channels and state channels, as well as extensive theoretical discussion on the topic. However, these projects appear to be inactive today, and this clearly cannot be attributed to a lack of developer effort. Currently, the active projects on Ethereum adopt the “pool-based” model rather than the “point-to-point contract” model, and this is not by chance. Likewise, while people may be satisfied with Ethereum’s programmability today, if one were to aim for “account abstraction” (a generalization of the wallet concept), Ethereum’s account model is inherently limited.

To explore the programmability of CKB, we likewise need to understand the structural features of CKB’s smart contract system in the above aspects. We already know that CKB allows arbitrary computation to be programmed, allows additional state to be recorded within contracts, and allows one contract to access the state of another contract during execution. However, CKB’s basic form of state representation is the transaction output, referred to as a “Cell,” which makes it fundamentally different from Ethereum. Therefore, an understanding of Ethereum’s smart contract system and its contract instances would not help us comprehend how CKB achieves these structural features or recognize CKB’s programmability.

Fortunately, smart contracts on Bitcoin seem to provide the best foundation for understanding the programmability of CKB. This is not only because the basic form of state representation on Bitcoin is also the output of transactions (referred to as “UTXO”), but also because, with the concept of “covenants” discussed in the Bitcoin community, we can understand how CKB possesses the aforementioned structural features. We can appropriately break down the final effect into several parts and systematically identify the programmability enhancements each of them brings.

2. CKB vs. BTC: What’s More?

(I) Basic Structure

As the basic form of state representation in Bitcoin, Bitcoin’s UTXO (Unspent Transaction Output) has two fields:

  1. Amount, measured in satoshis, represents the value of Bitcoin held by the UTXO.
  2. ScriptPubKey, also known as the locking script, represents the conditions to spend this amount, which is the smart contract program that sets the conditions to unlock these satoshis.

Compared to later smart contract systems, the Bitcoin script is quite limited:

  • It does not allow arbitrary computation; it only supports a few practical computations for verification, such as signature, hash preimage, and time checks.

Note: A recent innovation called “BitVM” challenges this stereotype. The basic idea is that, since any computation can be broken down into NAND gates, which Bitcoin script can express, two parties can reach agreement on the result of an arbitrary computation. However, because setting up this kind of contract can cost a very large amount of time, computational resources and storage, it should be seen as impractical, at least for now.

  • It does not allow the recording of additional state within the contract (for example, you cannot use a script to cap the proportion or amount spent in a single transaction, or embed a token within it).
  • It also does not allow access to another contract’s state during execution (each script exists independently, with no interdependence).

While the Bitcoin script may be limited, it does not lack the ability to create impressive applications. It serves as the foundation for exploring the programmability of CKB, and in later sections I will introduce two examples of Bitcoin script programming.

Comparatively, CKB’s basic form of state representation is called a “Cell” and has four fields:

  1. Capacity, similar to UTXO’s amount, represents the size in bytes that this Cell can occupy.
  2. Lock Script, similar to UTXO’s ScriptPubKey, defines ownership of this Cell. Only when the provided data passes the Lock Script can the Cell be “updated” (in other words, released so that its Capacity can be used to mint new Cells).
  3. Data, arbitrary data, with its size limited by Capacity.
  4. Type Script, an optional script used to set conditions for updating Data.

Additionally, Lock Script and Type Script can perform arbitrary computation. You can program any signature verification algorithm or preimage checks for any hash algorithm, and so on.
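To make the four fields concrete, here is a minimal sketch of a Cell as a Python data structure. It is a model for illustration, not CKB's actual serialization: the class and field names are made up, and the size accounting (8 bytes for the Capacity field, a 32-byte code hash plus a 1-byte hash type plus the args for each script) is a simplified assumption about how occupied space is counted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Script:
    code_hash: bytes    # identifies the program to run (assumed 32 bytes)
    hash_type: int      # how code_hash is interpreted (assumed 1 byte)
    args: bytes = b""   # parameters, e.g. a public key hash

    def size(self) -> int:
        return len(self.code_hash) + 1 + len(self.args)


@dataclass(frozen=True)
class Cell:
    capacity: int                         # bytes this Cell is allowed to occupy
    lock_script: Script                   # who may update (consume) the Cell
    data: bytes = b""                     # arbitrary state
    type_script: Optional[Script] = None  # optional rules for updating data

    def occupied_bytes(self) -> int:
        size = 8 + self.lock_script.size() + len(self.data)
        if self.type_script is not None:
            size += self.type_script.size()
        return size

    def fits(self) -> bool:
        """Data (and everything else in the Cell) is bounded by Capacity."""
        return self.occupied_bytes() <= self.capacity


owner_lock = Script(code_hash=bytes(32), hash_type=1, args=bytes(20))
plain_cell = Cell(capacity=61, lock_script=owner_lock)
too_small = Cell(capacity=61, lock_script=owner_lock, data=b"hello")
```

Here `plain_cell` exactly fills its Capacity, while `too_small` would be rejected because its Data pushes the occupied size past the Capacity.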

It’s apparent that Cells offer an enhancement in programmability compared to UTXOs:

  • Cells can perform arbitrary computation, rather than being limited to specific computations, making ownership verification more flexible.
  • The Data and Type Script fields enable Cells to record additional state, allowing Cells to carry so-called “User-Defined Tokens (UDTs).”

Combining with the structure of “transaction output” inherent to Cells, these two points alone provide significant benefits. However, based on the above description, we still don’t know how Cells achieve “a contract can access the state of another contract during execution”. Hence, we need to delve into a concept that the Bitcoin community has been discussing for a long time: “Covenants.”

(II) Covenants & Introspection

The purpose of a covenant is to limit where money can be spent. In the current Bitcoin system (which has not yet deployed any proposed covenants), once funds are unlocked, they can be spent anywhere (paid to any ScriptPubKey). The idea behind covenants is that spending can be restricted to certain destinations. For example, if a specific UTXO can only be spent by a specific transaction, then even if someone can provide a signature for that UTXO, where it can be spent has already been determined by that transaction. This feature may seem a bit strange, but it can lead to some interesting applications, which will be discussed in a dedicated section later. Importantly, it is a key element in furthering our understanding of CKB’s programmability.

Rusty Russell correctly points out that covenants can be understood as a construction to allow introspection. In other words, when a UTXO A is spent by a transaction B, the script program can read parts (or all) of transaction B and then check if they match the parameters required by the script. For example, the program can verify if the ScriptPubKey of the first output in transaction B matches the one required by the script of UTXO A (which is the basic idea of covenant).

Now you might realize that with full introspection capabilities, a transaction’s input can read the state of another input within the same transaction, which is the feature we mentioned earlier, “a contract can access the state of another contract during execution.” In fact, CKB Cells have been designed to operate in this manner.

Based on this, we can categorize this complete introspection into four scenarios:

  • Lock Script reads other Lock Scripts (both input and output)
  • Lock Script reads other Type Scripts and Data (both input and output)
  • Type Script reads other Lock Scripts (both input and output)
  • Type Script reads other Type Scripts and Data (both input and output)

This allows us to analyze, under an assumed division of roles between Lock Script and Type Script, the part each kind of introspection plays in different application scenarios, and thus to understand the programmability enhancement brought by each.

In the following two sections, we will learn about current Bitcoin script programming (before any proposed covenant is deployed) and the features that covenant proposals are expected to provide. This will help us gain a concrete understanding of how CKB Cells can be programmed, and how they can do better.

3. Bitcoin Script Programming

I will use “Lightning Network” and “Discreet Log Contracts (DLC)” as examples of Bitcoin script-based application programming. Before delving into these examples, let’s first explore two key concepts.

(I) OP_IF & “Commitment Transaction”

The first concept is the use of flow control opcodes in Bitcoin scripts, such as OP_IF and OP_ELSE. These opcodes are similar to IF statements in programming, allowing different statements to be executed based on different inputs. In the context of Bitcoin scripts, this means that we can define multiple spending paths for funds, and when combined with the time lock feature, it allows us to allocate priority to different actions.

Taking the well-known “Hash Time-Locked Contract (HTLC)” as our example, in simple terms, this script can be explained as follows:

Either Bob can reveal the pre-image of a specific hash value H and provide his signature to spend the funds,

or Alice can spend the funds with her signature after a period of time T has passed.

This “either … or …” effect is achieved through flow control opcodes.
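As a rough illustration of the two spending paths, the sketch below models the HTLC as a plain Python predicate. It is not Bitcoin Script: the names `Witness` and `htlc_unlocks` are made up for this example, SHA-256 is assumed as the hash function, and signature checking is reduced to comparing a signer name.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Witness:
    signer: str                       # stand-in for "a valid signature by this party"
    preimage: Optional[bytes] = None  # supplied only on the hash path


def htlc_unlocks(witness: Witness, hash_h: bytes, timeout_t: int, now: int) -> bool:
    """Return True if the witness satisfies one of the two HTLC branches."""
    if witness.preimage is not None:
        # "OP_IF" branch: Bob reveals the preimage of H and signs.
        return (witness.signer == "Bob"
                and hashlib.sha256(witness.preimage).digest() == hash_h)
    # "OP_ELSE" branch: Alice signs once the timeout T has passed.
    return witness.signer == "Alice" and now >= timeout_t


secret = b"swap-secret"
hash_h = hashlib.sha256(secret).digest()

bob_claims = htlc_unlocks(Witness("Bob", secret), hash_h, timeout_t=100, now=10)
alice_early = htlc_unlocks(Witness("Alice"), hash_h, timeout_t=100, now=10)
alice_refund = htlc_unlocks(Witness("Alice"), hash_h, timeout_t=100, now=100)
```

The timelock is what gives Bob priority: before T only the preimage path works, and after T Alice can take the funds back.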

The most prominent advantage of HTLC is that it can bundle multiple operations together, achieving Atomicity.

For example, if Alice wants to exchange BTC for CKB with Bob, Bob can first provide a hash value and create an HTLC on the Nervos Network. Then, Alice can create an HTLC on the Bitcoin network using the same hash value.

Either Bob claims the BTC that Alice has provided on the Bitcoin network, revealing the pre-image in the process and thereby allowing Alice to take the CKB on the Nervos Network.

Or Bob doesn’t reveal the pre-image, both contracts expire, and both Alice and Bob can retrieve the funds they initially committed.

After the activation of the Taproot soft fork, the feature of multiple spending paths is further enhanced with the introduction of MAST (Merklized Abstract Syntax Trees). We can turn each spending path into a leaf of a Merkle tree, and each leaf is independent. As a result, we no longer need to use flow control opcodes as before. Moreover, since revealing one path doesn’t expose the other paths, we can add more spending paths to an output without worrying about cost.

The second concept is “commitment transactions.” The idea behind commitment transactions is that, in some situations, a valid Bitcoin transaction can be considered real and binding even if it hasn’t received confirmations on the blockchain.

For example, suppose Alice and Bob jointly own a UTXO that requires signatures from both of them to spend (a “2-of-2 multisig output”). Alice constructs a transaction that spends it, transferring 60% of the value to Bob and the remaining 40% to herself. Alice signs this transaction and sends it to Bob. From Bob’s perspective, he doesn’t need to broadcast this transaction to the Bitcoin network, nor does it require blockchain confirmations. The payment in this transaction is still real and credible. This is because Alice cannot spend this UTXO on her own (and therefore cannot double-spend it), and because the signature Alice provided is valid, Bob can add his own signature at any time and broadcast the transaction to settle the payment. In other words, Alice has given Bob a “credible commitment” through this valid (off-chain) transaction.

Commitment transactions are a fundamental concept in Bitcoin application programming. As previously mentioned, Bitcoin contracts are verification-based, stateless, and do not allow for cross-contract access. However, if a contract doesn’t carry state, where is that state stored, and how does the contract safely progress (change state)? Commitment transactions provide a straightforward answer: a contract’s state can be expressed in the form of transactions, allowing the contract participants to maintain their own state off-chain, without the need to expose it on the blockchain. The issue of state changes in the contract can be transformed into the problem of how to securely update commitment transactions.

Additionally, if there is concern about the potential risks of entering into such contracts (e.g., a contract that requires both parties to sign for spending has the risk of the other party not responding and causing a deadlock), one can simply pre-generate the commitment transaction to spend that contract and obtain the necessary signatures. This can mitigate the risk and eliminate the need for trust in other participants.

(II) Lightning Channels & Lightning Network

A Lightning Channel is a one-to-one contract in which both parties can pay each other an unlimited number of times without any single payment needing confirmation on the blockchain. As you might expect, it relies on commitment transactions.

In the previous section on “commitment transactions,” we have introduced a type of payment channel. However, this channel, which uses a simple 2-of-2 multi-signature contract, can only facilitate one-way payments. In other words, either Alice pays Bob continuously or Bob pays Alice continuously, until the payer’s balance in the contract is exhausted. If the payments are bi-directional, after a state update, one party might have a lower balance compared to the previous state, but they still possess the previous commitment transaction signed by the other party. Is there a way to prevent them from broadcasting the old commitment transaction and ensure they can only broadcast the latest one? Currently, the Lightning Channel solves this problem with a solution called “LN-Penalty.”

Now, let’s assume Alice and Bob each have 5 BTC in a channel. Alice wants to pay 1 BTC to Bob, so she signs a commitment transaction like this and sends it to Bob:

Input #0, 10 BTC:
	Alice-Bob 2-of-2 multi-signature output (i.e., channel contract)

Output #0, 4 BTC:
	Alice single signature

Output #1, 6 BTC:
	Either
		Alice-Bob temporary joint public key #1 single signature
	or
		Time lock T1, Bob single signature

Bob also signs a commitment transaction (corresponding to the transaction described above) and sends it to Alice:

Input #0, 10 BTC:
	Alice-Bob 2-of-2 multi-signature output (i.e., channel contract)

Output #0, 6 BTC:
	Bob single signature

Output #1, 4 BTC:
	Either
		Bob-Alice temporary joint public key #1 single signature
	or
		Time lock T1, Alice single signature

The trick to how Lightning Network can “revoke” old states lies in the “temporary joint public key,” which is generated using one’s own public key and the public key provided by the other party. For example, the Alice-Bob temporary joint public key is created by Alice using her own public key and Bob’s provided public key, each multiplied by a hash value and then added together. This temporary public key is generated in a way that no one knows its private key at the time of creation. However, if Bob shares the private key of the public key he provided, Alice can then calculate the private key of this temporary joint public key.
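To make the key construction above concrete, here is a minimal Python sketch. It rests on loud assumptions: a tiny multiplicative group (p = 2039, q = 1019, g = 4) stands in for secp256k1, so “multiplying a key by a hash” becomes exponentiation and “adding points” becomes multiplication mod p. The helper names (`pub`, `tweak`, `joint_pubkey`, `joint_privkey`) are made up for this post, and the hash layout is a simplification rather than the exact derivation used by Lightning implementations.

```python
import hashlib

# Toy group standing in for secp256k1 (assumption, insecure size):
# P_MOD = 2 * Q + 1, both prime; G = 4 generates the subgroup of order Q.
P_MOD, Q, G = 2039, 1019, 4

def pub(secret: int) -> int:
    """Public key of `secret` ("secret.G" in the article's notation)."""
    return pow(G, secret, P_MOD)

def tweak(first: int, second: int) -> int:
    """Hash two public keys into a scalar (the "hash value" a key is multiplied by)."""
    digest = hashlib.sha256(f"{first}|{second}".encode()).digest()
    return int.from_bytes(digest, "big") % Q

def joint_pubkey(own_pub: int, their_pub: int) -> int:
    """own_pub * h1 + their_pub * h2, computable from public keys alone."""
    h1, h2 = tweak(own_pub, their_pub), tweak(their_pub, own_pub)
    return pow(own_pub, h1, P_MOD) * pow(their_pub, h2, P_MOD) % P_MOD

def joint_privkey(own_secret: int, their_secret: int) -> int:
    """Private key of the joint key; it can only be computed with BOTH secrets."""
    own_pub, their_pub = pub(own_secret), pub(their_secret)
    h1, h2 = tweak(own_pub, their_pub), tweak(their_pub, own_pub)
    return (own_secret * h1 + their_secret * h2) % Q

alice_secret, bob_secret = 321, 654   # hypothetical example secrets
temp_key = joint_pubkey(pub(alice_secret), pub(bob_secret))

# Later, Bob reveals bob_secret to revoke the old state; Alice can now sign for temp_key.
penalty_key = joint_privkey(alice_secret, bob_secret)
assert pub(penalty_key) == temp_key
```

Until Bob reveals his secret, neither side can evaluate `joint_privkey`, which is why the temporary-key path stays closed for the latest state.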

The next time both parties want to update the channel’s state (initiate a payment), they hand each other the private keys of the public keys they provided in the previous round, so that each side can now compute the private key of the other side’s temporary joint public key. After that, neither participant dares to broadcast the commitment transaction received in the previous round: the output that allocates value to the local party has two spending paths, and the private key for the temporary-public-key path is now known to the remote party. So, if an old commitment transaction is broadcast, the remote party can immediately use the private key of the temporary joint public key to claim all the funds in that output. This is the meaning of “LN-Penalty.”

To be more specific, the sequence of interactions goes as follows: the party initiating the payment first requests a new temporary public key from the counterparty. They then construct a new commitment transaction and hand it over to the counterparty. The party who receives the commitment transaction exposes the private key of the temporary public key they provided in the previous round. This sequence of interactions ensures that participants always receive the new commitment transaction before invalidating the commitment transaction they received in the previous round, making the process trustless.

In summary, the key design elements of the Lightning Channel are as follows:

  1. Both parties always use commitment transactions to represent the internal state of the contract and express payments through changes in amounts.
  2. Commitment transactions always spend the same input (requiring both parties to provide signatures for the input), making all commitment transactions mutually competitive. Ultimately, only one of them gets confirmed on the blockchain.
  3. The two participants do not sign the same commitment transaction (even though the transactions come in pairs). Each signs a transaction that is more favorable to themselves; in other words, the commitment transaction each participant receives is disadvantageous to the receiver.
  4. This disadvantage is reflected in the output that allocates value to the holder (the local party), which has two unlocking paths: one path can be unlocked with their own signature but requires a waiting period, while the other path involves the counterparty’s public key and stays protected only as long as their own temporary private key remains undisclosed.
  5. In each payment, both parties exchange the temporary private key used by the other party in the previous round with a new commitment transaction. This way, the party that has given away the temporary private key no longer dares to broadcast the old commitment transaction, effectively “revoking” the previous commitment transaction and updating the contract’s state. (In reality, these commitment transactions are all valid and can be broadcast on the blockchain, but participants refrain from doing so due to penalties.)
  6. Either party can exit the contract at any time by using the commitment transaction signed by the counterparty. However, if both parties are willing to cooperate, they can sign a new transaction that allows both of them to immediately retrieve their respective funds.
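The design elements above can be modeled as a small state machine. The sketch below is only a bookkeeping model written for this post (the `Commitment` and `Party` classes and both functions are hypothetical); it ignores signatures, fees and the time lock, and simply assumes the watching party reacts before the time lock expires.

```python
from dataclasses import dataclass, field

@dataclass
class Commitment:
    index: int       # state number
    to_holder: int   # output paying the party who holds (and may broadcast) this tx
    to_remote: int   # output paying the counterparty

@dataclass
class Party:
    name: str
    held: list = field(default_factory=list)        # commitment txs signed by the peer
    peer_secrets: set = field(default_factory=set)  # states whose temp private key the peer revealed

def update(a: Party, b: Party, a_balance: int, b_balance: int) -> None:
    """Exchange a new pair of commitment txs, then revoke the previous pair."""
    index = len(a.held)
    a.held.append(Commitment(index, a_balance, b_balance))
    b.held.append(Commitment(index, b_balance, a_balance))
    if index > 0:
        # Secrets for the old state are revealed only after the new txs are in hand.
        a.peer_secrets.add(index - 1)
        b.peer_secrets.add(index - 1)

def unilateral_close(broadcaster: Party, watcher: Party, index: int) -> dict:
    """Payouts if `broadcaster` publishes the commitment tx of state `index`."""
    tx = broadcaster.held[index]
    if index in watcher.peer_secrets:
        # Revoked state: the watcher also claims the broadcaster's output (the penalty).
        return {broadcaster.name: 0, watcher.name: tx.to_holder + tx.to_remote}
    return {broadcaster.name: tx.to_holder, watcher.name: tx.to_remote}

alice, bob = Party("Alice"), Party("Bob")
update(alice, bob, 5, 5)                  # opening state
update(alice, bob, 4, 6)                  # Alice pays Bob 1 BTC
print(unilateral_close(alice, bob, 0))    # {'Alice': 0, 'Bob': 10}: cheating is punished
print(unilateral_close(alice, bob, 1))    # {'Alice': 4, 'Bob': 6}: honest exit
```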

Finally, since commitment transactions can also include HTLCs, a Lightning channel can also forward payments. If Alice can find a path through the interconnected Lightning channels that reaches Daniel, she can make a trustless multi-hop payment without having to open a direct channel with Daniel. This is the essence of the Lightning Network.

Alice -- HTLC --> Bob -- HTLC --> Carol -- HTLC --> Daniel

Alice <-- Pre-image -- Bob <-- Pre-image -- Carol <-- Pre-image -- Daniel

When Alice discovers such a path and wishes to pay Daniel, she requests a hash value from Daniel, uses it to construct an HTLC to Bob, and prompts Bob to forward a message to Carol while offering her the same HTLC. The message to Carol in turn instructs her to forward the message to Daniel and offer him the same HTLC. When the message reaches Daniel, he reveals the pre-image to Carol, thereby obtaining the value of the HTLC and updating the contract state. Carol follows the same process, revealing the pre-image to Bob to receive his payment and update the channel state. Finally, Bob reveals the pre-image to Alice and updates the state. Due to the characteristics of HTLCs, this chain of payments either succeeds together or fails together, making it trustless.
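The forward-then-backward flow can be sketched in a few lines of Python. This is a simplified model, not Lightning’s wire protocol: the `Channel` class and `pay_along_path` function are hypothetical, and routing fees, time locks and the failure path are left out.

```python
import hashlib
import secrets

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

class Channel:
    """One payment channel, seen only in the payer -> payee direction."""

    def __init__(self, payer: str, payee: str, payer_balance: int, payee_balance: int):
        self.payer, self.payee = payer, payee
        self.balances = {payer: payer_balance, payee: payee_balance}
        self.htlc = None                      # (payment_hash, amount) while pending

    def offer_htlc(self, payment_hash: bytes, amount: int) -> None:
        if self.balances[self.payer] < amount:
            raise ValueError("insufficient balance in channel")
        self.htlc = (payment_hash, amount)

    def settle(self, preimage: bytes) -> None:
        payment_hash, amount = self.htlc
        if sha256(preimage) != payment_hash:
            raise ValueError("wrong pre-image")
        self.balances[self.payer] -= amount
        self.balances[self.payee] += amount
        self.htlc = None

def pay_along_path(path: list, amount: int, preimage: bytes) -> None:
    """`preimage` is known only to the final recipient until settlement starts."""
    payment_hash = sha256(preimage)           # what the recipient hands to the sender
    for channel in path:                      # forward: offer the same HTLC hop by hop
        channel.offer_htlc(payment_hash, amount)
    for channel in reversed(path):            # backward: the pre-image travels to the sender
        channel.settle(preimage)

path = [Channel("Alice", "Bob", 5, 5), Channel("Bob", "Carol", 5, 5),
        Channel("Carol", "Daniel", 5, 5)]
pay_along_path(path, 1, secrets.token_bytes(32))
print(path[0].balances, path[2].balances)
```

Bob and Carol end up with one unit more in one channel and one unit less in the other, so they are net neutral (in practice they would also earn a small forwarding fee).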

The Lightning Network is composed of a network of channels, with each channel (contract) being independent of the others. This means that Alice only needs to be aware of what is happening within her channel with Bob and doesn’t need to concern herself with how many interactions have occurred in other people’s channels or what currencies have been used in those interactions. She doesn’t even need to know whether they have actually used the channel or not.

The scalability of the Lightning Network is evident in the fact that the payment speed within a single Lightning channel is only limited by the hardware resources of the involved parties, and also in the decentralized storage of channel states, allowing individual nodes to achieve the maximum leverage with minimal cost.

(III)Discreet Log Contract

Discreet Log Contract (DLC) employs a cryptographic technique called “adaptor signature” that allows Bitcoin scripts to programmatically create financial contracts dependent on external events.

Adaptor signatures make it so that a signature becomes valid only when a private key is added. Taking Schnorr signatures as an example, the standard form of a Schnorr signature is (R, s), where:

R = r.G
# The nonce "r" used in the signature is multiplied by the generator point G; R can be seen as the public key of "r"

s = r + Hash(R || m || P) * p
# p = signing private key, P = signing public key

Verifying a signature is verifying s.G = r.G + Hash(R || m || P) * p.G = R + Hash(R || m || P) * P

Suppose I provide a pair of data, (R, s'), where:

R = R1 + R2 = r1.G + r2.G

s' = r1 + Hash(R || m || P) * p

Clearly, this is not a valid Schnorr signature, and it won’t pass the verification formula. However, I can prove to the verifier that, if they know the private key r2 for R2, it can be turned into a valid signature:

s'.G + R2 = R1 + Hash(R || m || P) * P + R2 = R + Hash(R || m || P) * P
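The equations above can be checked numerically. The following Python sketch is an illustration under stated assumptions: a tiny multiplicative group (p = 2039, q = 1019, g = 4) replaces the elliptic curve, so “x.G” becomes `pow(G, x, P_MOD)` and point addition becomes multiplication mod p. All function names are invented for this post, and the hash encoding is not the one used by any real Schnorr standard.

```python
import hashlib

# Toy group standing in for secp256k1 (assumption, insecure size).
P_MOD, Q, G = 2039, 1019, 4

def pub(k: int) -> int:
    """'k.G' in the article's notation."""
    return pow(G, k, P_MOD)

def challenge(R: int, m: str, P: int) -> int:
    """Hash(R || m || P) reduced to a scalar."""
    digest = hashlib.sha256(f"{R}|{m}|{P}".encode()).digest()
    return int.from_bytes(digest, "big") % Q

def verify(R: int, s: int, m: str, P: int) -> bool:
    """Standard check: s.G == R + Hash(R || m || P) * P."""
    return pub(s) == R * pow(P, challenge(R, m, P), P_MOD) % P_MOD

def adaptor_sign(p: int, r1: int, R2: int, m: str):
    """Return (R, s') with R = R1 + R2 and s' = r1 + Hash(R || m || P) * p."""
    R = pub(r1) * R2 % P_MOD
    return R, (r1 + challenge(R, m, pub(p)) * p) % Q

def adaptor_verify(R: int, s_adaptor: int, R2: int, m: str, P: int) -> bool:
    """Check s'.G + R2 == R + Hash(R || m || P) * P without knowing r2."""
    return pub(s_adaptor) * R2 % P_MOD == R * pow(P, challenge(R, m, P), P_MOD) % P_MOD

p, r1, r2, m = 123, 456, 789, "example message"   # hypothetical values
P = pub(p)
R, s_adaptor = adaptor_sign(p, r1, pub(r2), m)

assert adaptor_verify(R, s_adaptor, pub(r2), m, P)   # verifiably "almost" a signature
assert not verify(R, s_adaptor, m, P)                # but not valid on its own
s = (s_adaptor + r2) % Q                             # whoever knows r2 completes it
assert verify(R, s, m, P)
assert (s - s_adaptor) % Q == r2                     # and a published s leaks r2
```

The last line shows the other half of the trick: once the completed signature appears on-chain, anyone holding the adaptor signature can recover the secret by subtraction.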

Adaptor signatures make the validity of a signature dependent on a secret, and it is verifiable. But what does this have to do with financial contracts?

Suppose Alice and Bob want to bet on the outcome of a soccer match. Alice and Bob are betting on different teams, say the Green Magicians and the Red Arrows, with a wager of 1 BTC. Additionally, a sports analysis website, Carol, promises to publish a signature s_c_i on the outcome with a nonce R_c when the match result is revealed.

There are three possible outcomes (and thus three possible signatures from Carol):

  • Green Magicians win, and Alice wins 1 BTC.
  • Red Arrows win, and Bob wins 1 BTC.
  • A draw results in a refund of their funds to both parties.

For this purpose, they create a commitment transaction for each possible outcome. For example, the commitment transaction they create for the first outcome is as follows:

Input #0, 2 BTC:
	Alice-Bob 2-of-2 multi-signature output (i.e., the betting contract)

Output #0, 2 BTC:
	Alice single signature

However, the signatures that Alice and Bob create for this transaction are not in the form of (R, s) but are adaptor signatures (R, s'). In other words, the signatures provided by both parties cannot be used directly to unlock this contract; a secret value must be revealed first. This secret value is precisely the private key of s_c_1.G, that is, Carol’s signature s_c_1 itself! Because Carol’s signature nonce is already known (it’s R_c), s_c_1.G can be computed in advance (s_c_1.G = R_c + Hash(R_c || 'Green Magicians win' || PK_c) * PK_c).

When the result is revealed, let’s say the Green Magicians won, Carol will then publish the signature (R_c, s_c_1). In this case, both Alice and Bob can complete their opponent’s adaptor signature, add their own signature, making the transaction valid, and then broadcast it to the network, triggering the settlement effect. If the Green Magicians did not win, Carol will not publish s_c_1, and this commitment transaction cannot become a valid transaction.

The other two transactions follow the same pattern. This way, Alice and Bob make the execution of this contract dependent on an external event (more precisely, on an oracle’s attestation of that event in the form of a signature) without needing to trust the other party. Various types of financial contracts, including futures and options, can be implemented using this approach.
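As a rough illustration of how the oracle’s future signature serves as the adaptor secret, here is a Python sketch. It uses the same kind of toy group as a stand-in for secp256k1 (p = 2039, q = 1019, g = 4), and every name and number in it (`anticipation_point`, `oracle_sign`, the keys and nonces) is hypothetical.

```python
import hashlib

# Toy group standing in for secp256k1 (assumption, insecure size).
P_MOD, Q, G = 2039, 1019, 4

def pub(k: int) -> int:
    return pow(G, k, P_MOD)

def challenge(R: int, m: str, P: int) -> int:
    digest = hashlib.sha256(f"{R}|{m}|{P}".encode()).digest()
    return int.from_bytes(digest, "big") % Q

def verify(R: int, s: int, m: str, P: int) -> bool:
    return pub(s) == R * pow(P, challenge(R, m, P), P_MOD) % P_MOD

def anticipation_point(R_c: int, outcome: str, PK_c: int) -> int:
    """s_c_i.G, computed from public data before the match is played."""
    return R_c * pow(PK_c, challenge(R_c, outcome, PK_c), P_MOD) % P_MOD

def oracle_sign(c: int, k_c: int, outcome: str) -> int:
    """Carol's signature scalar s_c_i on the outcome that actually happened."""
    return (k_c + challenge(pub(k_c), outcome, pub(c)) * c) % Q

# Carol (the oracle): private key c and a pre-announced nonce k_c.
c, k_c = 77, 88
PK_c, R_c = pub(c), pub(k_c)

# Bob's adaptor signature on the "Green Magicians win" commitment transaction.
bob_key, r1, tx = 11, 22, "pay 2 BTC to Alice"
S1 = anticipation_point(R_c, "Green Magicians win", PK_c)
R = pub(r1) * S1 % P_MOD
s_adaptor = (r1 + challenge(R, tx, pub(bob_key)) * bob_key) % Q

# The match ends; Carol publishes s_c_1 and Alice completes Bob's signature.
s_c_1 = oracle_sign(c, k_c, "Green Magicians win")
assert pub(s_c_1) == S1
assert verify(R, (s_adaptor + s_c_1) % Q, tx, pub(bob_key))
```

Note that Carol never sees `tx` or Bob’s key: she only signs the outcome string, which is where the privacy property discussed next comes from.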

Compared to other forms of implementation, the most significant feature of Discreet Log Contracts (DLC) is its privacy:

  1. Alice and Bob do not need to inform Carol that they are using Carol’s data, and this has no impact on the execution of the contract.
  2. On-chain observers, including Carol, cannot deduce from Alice and Bob’s contract execution transactions which website’s services they are using. They can’t even determine whether their contract is a betting contract (as opposed to a Lightning Network channel contract).

4. Applications for Covenants

(I) OP_CTV & Congestion Control

Bitcoin developers have proposed many ideas that can be classified as covenant proposals. One of the most prominent at the moment is OP_CHECKTEMPLATEVERIFY (OP_CTV). Its concept is relatively simple yet retains significant flexibility, which makes it popular in a Bitcoin community that values simplicity. The idea behind OP_CTV is to commit a hash value in a script so that only the transaction specified by this hash can spend the funds. The hash commits to the transaction’s outputs and most of its fields, but not to its inputs; it only commits to the number of inputs.

“Congestion Control” is a great example that showcases the features of OP_CTV. Its primary use case is to help a large number of users move from an exchange (a trusted environment) into a payment pool. Since this payment pool uses OP_CTV to plan its future spending, users can exit the pool in a trustless manner without assistance from anyone. Furthermore, because the pool is represented as a single UTXO, it avoids paying high fees when on-chain transaction demand is high (n outputs are reduced to just 1 output). Users within the pool can also exit at their convenience.

Suppose Alice, Bob, and Carol want to withdraw 5 BTC, 3 BTC, and 2 BTC, respectively, from the exchange. The exchange can create an output of 10 BTC with three OP_CTV branches. If Alice wants to make a withdrawal, she can use branch 1. The OP_CTV in branch 1 commits to a transaction, represented by a hash, that creates two outputs. One output allocates 5 BTC to Alice, and the other is another payment pool that uses OP_CTV to commit to a transaction allowing Bob to withdraw 3 BTC and sending the remaining 2 BTC to Carol.

When Bob or Carol wants to make a withdrawal, the process is similar. They can only use transactions that pass the corresponding OP_CTV check, meaning they can only withdraw the specific amount allocated to them and cannot withdraw arbitrary amounts. The remaining funds again enter a payment pool locked with OP_CTV, ensuring that regardless of the order in which users withdraw, the remaining users can trustlessly exit the pool.

In abstract terms, OP_CTV serves the purpose of planning paths for the contract towards the end of its lifecycle, ensuring that the payment pool contract, regardless of the path it follows and the state it reaches, maintains the property of trustless exits.
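The recursive structure of such a pool can be sketched in Python. This is a toy model only: `template_hash` hashes a JSON encoding instead of BIP-119’s actual serialization, scripts are plain strings and dictionaries, and all function names are invented for illustration.

```python
import hashlib
import json

def template_hash(outputs: list, n_inputs: int = 1) -> str:
    """Simplified stand-in for the OP_CTV hash: commits to outputs and input count."""
    payload = json.dumps({"n_inputs": n_inputs, "outputs": outputs}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def pool_output(balances: dict) -> dict:
    """A plain payout for a single user, or a pool with one OP_CTV branch per user."""
    if len(balances) == 1:
        (user, amount), = balances.items()
        return {"amount": amount, "script": f"pay-to:{user}"}
    branches = {user: template_hash(exit_outputs(balances, user)) for user in balances}
    return {"amount": sum(balances.values()), "script": {"ctv_branches": branches}}

def exit_outputs(balances: dict, user: str) -> list:
    """Outputs of the transaction that lets `user` leave: their payout plus the rest pool."""
    rest = {u: a for u, a in balances.items() if u != user}
    return [{"amount": balances[user], "script": f"pay-to:{user}"}, pool_output(rest)]

def can_spend(pool: dict, branch: str, tx_outputs: list, n_inputs: int = 1) -> bool:
    """The branch is satisfied only by a transaction matching the committed template."""
    return pool["script"]["ctv_branches"][branch] == template_hash(tx_outputs, n_inputs)

balances = {"Alice": 5, "Bob": 3, "Carol": 2}
pool = pool_output(balances)

honest = exit_outputs(balances, "Alice")
greedy = [{"amount": 6, "script": "pay-to:Alice"}, pool_output({"Bob": 2, "Carol": 2})]
print(can_spend(pool, "Alice", honest), can_spend(pool, "Alice", greedy))
```

Because every branch commits to a rest pool that is built the same way, each possible exit order is fixed at the moment the exchange creates the 10 BTC output.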

This type of OP_CTV has another interesting application: “silent one-way payment channel.” Suppose Alice creates such a payment pool and ensures that funds can be trustlessly withdrawn to an output with the following script:

Either Alice and Bob can spend it together, or after a certain period, Alice can spend it alone.

If Alice doesn’t reveal it to Bob, he won’t know about the existence of this output. Once Alice reveals it to Bob, he can treat this output as a time-bound one-way payment channel, and Alice can use the funds in it to pay Bob immediately without waiting for blockchain confirmations. Bob just needs to broadcast the latest commitment transaction (provided by Alice) to the blockchain before Alice can spend the funds alone.

(II)OP_VAULT & Vaults

OP_VAULT is a covenant proposed for creating “vault contracts.”

Vault contracts aim to provide a more secure and advanced form of self-custody. While current multi-signature contracts eliminate the single point of failure of a single private key, if an attacker manages to obtain the threshold number of private keys, the wallet’s owner is defenseless. Vaults seek to impose a limit on how much can be spent at one time, as well as an enforced waiting period when withdrawing through the regular path. During this waiting period, the withdrawal can be interrupted by an emergency recovery operation. With such a contract, even if the wallet is compromised, the owner can respond (using the emergency recovery branch).
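The behavior described above (a spending limit, a waiting period, and an emergency path that can interrupt a pending withdrawal) can be modeled as a small state machine. The `Vault` class below is a hypothetical illustration of the intended rules, not an implementation of any proposed opcode.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Vault:
    balance: int
    limit: int                                   # most that a single withdrawal may take
    delay: int                                   # waiting period (in blocks) on the regular path
    pending: Optional[Tuple[int, int]] = None    # (amount, unlock_height) once triggered

    def trigger(self, amount: int, height: int) -> None:
        """Start a withdrawal through the regular path."""
        if self.pending is not None or amount > min(self.limit, self.balance):
            raise ValueError("withdrawal not allowed")
        self.pending = (amount, height + self.delay)

    def finalize(self, height: int) -> int:
        """Complete the withdrawal once the waiting period has passed."""
        if self.pending is None:
            raise ValueError("nothing to finalize")
        amount, unlock_height = self.pending
        if height < unlock_height:
            raise ValueError("still inside the waiting period")
        self.balance -= amount
        self.pending = None
        return amount

    def recover(self) -> int:
        """Emergency branch: cancel any pending withdrawal and sweep to the recovery wallet."""
        swept, self.balance, self.pending = self.balance, 0, None
        return swept

vault = Vault(balance=10, limit=2, delay=144)
vault.trigger(2, height=1000)        # an attacker holding the hot keys starts a withdrawal
try:
    vault.finalize(height=1010)      # too early, the waiting period has not passed
except ValueError as err:
    print(err)
print(vault.recover())               # the owner reacts in time and sweeps all 10 to safety
```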

In theory, OP_CTV can also be used to create such contracts, but there are several inconveniences, including one related to transaction fees. When a commitment transaction is committed to, the fee it will pay is committed to as well. Given the long interval between setting up such a contract and withdrawing from it, it is almost impossible to predict the appropriate fee. While OP_CTV doesn’t restrict inputs, adding more inputs to increase the transaction fee is impractical because the added inputs would be used entirely for fees. Another method is Child Pays for Parent (CPFP), where the funds spent in the withdrawal are used to provide fees in a new transaction. Additionally, using OP_CTV means these vault contracts cannot be used for batch withdrawals (or, of course, batch recoveries).

The OP_VAULT proposal aims to address these issues by introducing new opcodes, namely OP_VAULT and OP_UNVAULT.

OP_UNVAULT is designed for batch recoveries, which we won’t discuss here. OP_VAULT works as follows: when placed on a branch of the script tree, it commits to an opcode (e.g., OP_CTV) without specifying its exact parameters. A transaction spending from this branch can provide the specific parameters but cannot modify the other branches. As a result, fees need not be committed in advance and can be set when the branch is spent. If the branch also carries a time lock, that time lock is enforced. Finally, because the spend can only change the branch it uses, the other branches of the script tree (including the emergency recovery branch) remain unaltered, which is what allows us to interrupt such a withdrawal.

In addition, two more points are worth mentioning.

(1) The behavior of the OP_VAULT opcode is similar to another covenant proposal: [OP_TLUV](https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-September/019419.html). Jeremy Rubin correctly pointed out that this has, to some extent, introduced the concept of “computation.” OP_TLUV/OP_VAULT first commits to an opcode and then lets users pass parameters for that opcode in a new transaction, thereby updating the entire script tree leaf. This goes beyond “validating incoming data based on certain conditions” to “generating new meaningful data based on incoming data,” even though the range of computation enabled is relatively limited.

(2) The full OP_VAULT proposal also leverages some mempool policy proposals, such as v3 transaction format, to achieve better results. This reminds us that the meaning of “programming” can be broader than we might imagine. (A similar example is Open Transaction in the Nervos Network.)

5. Understanding CKB

In the two previous sections, we discussed how we can program interesting applications using scripts in a more restricted structure (Bitcoin UTXO). We also introduced proposals aimed at adding introspection capabilities to this structure.

While UTXO has the ability to program these applications, you can also easily identify their drawbacks or areas for optimization, such as:

  • In LN-Penalty, channel participants must store every past commitment transaction and the corresponding penalty secret values to defend against possible cheats from their counterpart. This imposes a storage burden. If there were a mechanism to ensure that only the latest commitment transaction is effective, and older commitment transactions are not, it would eliminate this burden and protect nodes from accidentally penalizing themselves due to failures.
  • In DLC, assuming there are many possible outcomes for an event, there are numerous signatures that both parties must generate in advance and provide to each other, which is also a significant burden. Additionally, the revenue from DLC contracts is directly tied to public keys, making the position not easily transferable. Is there a way to transfer the position of the contract?

In fact, the Bitcoin community has already provided answers to these issues, primarily related to a Sighash proposal (BIP-118 AnyPrevOut). However, if we are programming on CKB, BIP-118 is essentially available now (using CKB’s introspection and the ability to selectively verify signatures to simulate this Sighash tag).

By learning Bitcoin programming, we not only understand how to program in the “transaction output” format (what CKB can program), but also how to improve these applications (how we can use CKB’s capabilities to enhance them when programming on CKB). For CKB developers, Bitcoin script programming can be considered as a learning resource and even a shortcut.

Next, we will analyze the programmability of each module in CKB programming, without considering introspection for now.

(I)Lock Script With Arbitrary Computation Programmability

As mentioned earlier, UTXOs cannot perform arbitrary computation. A Lock Script can, which means that it can program everything that can be built on UTXOs (before any covenant is deployed), including but not limited to Lightning channels and DLCs.

Furthermore, this capability to verify arbitrary computations allows Lock Scripts to support more and more flexible authentication methods compared to UTXOs. For instance, you can implement a Lightning channel on CKB with one party using an ECDSA signature and the other party using an RSA signature.

Actually, this is one of the areas that drew people to explore CKB from the very beginning: using this flexible authentication capability for user self-custody, thereby achieving so-called “account abstraction”, that is, flexible authorization and control over transaction validity, as well as recovery of control, with almost no restrictions. In principle, this combines “multiple spending paths” with “arbitrary authentication methods.” Examples of implementations include the JoyID wallet and UniPass.

Moreover, Lock Scripts can also implement the eltoo proposal, enabling Lightning channels to only need to keep the latest commitment transaction (in fact, eltoo can simplify all point-to-point contracts).

(II)Type Script With Arbitrary Computation Programmability

As mentioned earlier, one major use case for Type Scripts is to program User-Defined Tokens (UDTs). Combined with Lock Scripts, this means we can implement Lightning channels (and other contract types) with UDT as the underlying asset.

The separation of Lock Script and Type Script can be seen as a security upgrade: Lock Script focuses on implementing custody methods or contract-based protocols, while Type Script focuses on defining UDTs.

Additionally, the ability to initiate checks based on UDT definitions allows UDTs to participate in contracts similar to CKBytes (native token on CKB) since UDT is a first-class citizen.

For example, I once proposed a protocol for trustless NFT-backed lending on Bitcoin. The key to this protocol is a commitment transaction whose input value is less than its output value (so it is not yet a valid transaction). Once an input of sufficient value is added, the transaction becomes valid. As long as the borrower can repay the loan, the lender cannot take ownership of the pledged NFT. The trustless nature of this commitment transaction is based on checking the amounts of inputs and outputs. So, the borrower can only use Bitcoin for repayment, even if both the borrower and the lender are willing to accept another currency, such as USDT issued under the RGB protocol. Bitcoin’s commitment transactions cannot guarantee that returning enough USDT will result in the return of the borrower’s NFT, because Bitcoin transactions have no knowledge of the state of USDT. In other words, it is not possible to construct a commitment transaction with repayment in USDT as a condition.

If we could perform checks based on the definition of UDT, it would enable the lender to sign another commitment transaction that allows the borrower to use USDT for repayment. The transaction would verify the input and output quantities of USDT, thus using USDT as a repayment method becomes trustless.
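To illustrate what “checks based on the definition of UDT” could look like, here is a minimal Python model of a token rule: outside of issuance, the amount of a token leaving a transaction may not exceed the amount entering it. The `Cell` dataclass, the function name and all script names are hypothetical simplifications written for this post; they are not CKB’s actual data structures or syscalls.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    lock_script: str        # who controls the Cell
    type_script: str = ""   # which asset rule applies ("" for plain CKBytes)
    amount: int = 0         # token amount stored in the Cell's data

def udt_type_script_ok(inputs: list, outputs: list, udt: str, issuer_lock: str) -> bool:
    """Pass if the issuer takes part (minting) or no tokens are created from nothing."""
    if any(cell.lock_script == issuer_lock for cell in inputs):
        return True
    total_in = sum(c.amount for c in inputs if c.type_script == udt)
    total_out = sum(c.amount for c in outputs if c.type_script == udt)
    return total_out <= total_in

USDT, ISSUER = "usdt-type", "issuer-lock"

# Repayment commitment transaction: the borrower adds 100 USDT and gets the NFT back.
repay_inputs = [Cell("borrower-lock", USDT, 100), Cell("loan-contract-lock", "nft-type", 1)]
repay_outputs = [Cell("lender-lock", USDT, 100), Cell("borrower-lock", "nft-type", 1)]
assert udt_type_script_ok(repay_inputs, repay_outputs, USDT, ISSUER)

# Without a real USDT input, the same outputs are rejected, so repayment cannot be faked.
forged_inputs = [Cell("loan-contract-lock", "nft-type", 1)]
assert not udt_type_script_ok(forged_inputs, repay_outputs, USDT, ISSUER)
```

Because the token rule is enforced by the chain itself, a commitment transaction that pays the lender in USDT is incomplete until genuine USDT is supplied, just as the Bitcoin version is incomplete until enough bitcoin is supplied.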

Revision: Suppose that the NFT used as collateral and the tokens used for repayment are issued using the same protocol, such as RGB. In that case, the issue can be resolved by constructing a commitment transaction within the RGB protocol that synchronizes the state transitions of the NFT and the repayment (binding two state transitions within the RGB protocol to one transaction). However, because RGB transactions also depend on Bitcoin transactions, constructing the commitment transaction here may be somewhat challenging. In summary, while the problem can be resolved, it does not make the token a first-class citizen.

Next, let’s consider introspection.

(III)Lock Script Introspects Other Lock Scripts

A Lock Script that can introspect other Lock Scripts makes everything that covenant-enhanced Bitcoin UTXOs could program feasible, including the vault contract mentioned earlier and applications based on OP_CTV (such as congestion control).

XueJie once brought up a very interesting example: a receiving account Cell on CKB. When this Cell is used as an input of a transaction, if the corresponding output Cell (with the same Lock Script) has more Capacity, the input does not require a signature; the absence of a signature does not affect the transaction’s validity. This is impossible without the Cell’s ability to introspect. Such a receiving account Cell is well suited as an institutional receiving method because it aggregates funds, but it has the disadvantage of poor privacy.
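The rule of such a receiving account can be written down as a short predicate. The Python sketch below is a simplified model under assumptions: the `Cell` class and function name are hypothetical, the “corresponding output” is taken to be the single output carrying the same Lock Script, and fees and Cell data are ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    capacity: int
    lock_script: str

def receiving_lock_ok(inputs: list, outputs: list, lock_script: str,
                      has_valid_signature: bool) -> bool:
    """Owner path: a valid signature. Deposit path: same lock, strictly more Capacity."""
    if has_valid_signature:
        return True
    ins = [c for c in inputs if c.lock_script == lock_script]
    outs = [c for c in outputs if c.lock_script == lock_script]
    return len(ins) == 1 and len(outs) == 1 and outs[0].capacity > ins[0].capacity

shop, payer = Cell(100, "shop-lock"), Cell(50, "alice-lock")

# Alice pays 30 into the shop's account; the shop does not need to sign anything.
deposit = [Cell(130, "shop-lock"), Cell(20, "alice-lock")]
print(receiving_lock_ok([shop, payer], deposit, "shop-lock", has_valid_signature=False))

# Moving the shop's funds elsewhere without its signature is rejected.
theft = [Cell(150, "thief-lock")]
print(receiving_lock_ok([shop, payer], theft, "shop-lock", has_valid_signature=False))
```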

(IV)Lock Script Introspects Other Type Scripts(& Data)

An interesting application of this feature is with equity tokens. The Lock Script can determine whether to utilize its own Capacity and where those funds can be spent based on the quantity of tokens in other inputs (which requires the introspection of the Lock Script).
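One hypothetical way to read this is a treasury Cell whose Lock Script releases its Capacity only when the transaction’s other inputs carry a majority of an equity token. The sketch below models just that threshold check; the `Cell` class, the majority rule and all names are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    capacity: int
    lock_script: str
    type_script: str = ""   # asset rule of the Cell ("" for plain CKBytes)
    amount: int = 0         # token amount stored in the Cell's data

def treasury_lock_ok(inputs: list, equity_token: str, total_supply: int) -> bool:
    """Unlock the treasury only if the inputs bring a strict majority of equity tokens."""
    votes = sum(c.amount for c in inputs if c.type_script == equity_token)
    return 2 * votes > total_supply

treasury = Cell(1000, "treasury-lock")
majority = [treasury, Cell(61, "h1-lock", "equity", 40), Cell(61, "h2-lock", "equity", 20)]
minority = [treasury, Cell(61, "h3-lock", "equity", 30)]
print(treasury_lock_ok(majority, "equity", 100), treasury_lock_ok(minority, "equity", 100))
```

A fuller version could also inspect the outputs, for example to restrict where the released Capacity may go.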

(V)Type Script Introspects Other Lock Scripts

Let’s assume for now that this could be useful. For example, a Type Script can check whether the Lock Scripts of the inputs and outputs of a transaction remain unchanged.

(VI)Type Script Introspects Other Type Scripts(& Data)

Trading cards? Collect n tokens and exchange them for a larger token : )

6. Conclusion

Compared to earlier smart contract systems with arbitrary programmability, such as Ethereum, Nervos Network has adopted a different structure. Therefore, prior knowledge of those systems often does not help in understanding Nervos Network. This post starts from the BTC UTXO, a more constrained structure than the CKB Cell, to provide a method for understanding the programmability of CKB Cells. By using the concept of “introspection” to understand Cells’ capability for “cross-contract access,” we can categorize scenarios that utilize introspection and determine their specific use cases.

Revision:

  1. Without considering Cell’s cross-contract access capability (i.e., introspection), lock scripts can be seen as stateful and highly programmable Bitcoin Scripts. Therefore, based on this point alone, you can program every application that can be built on Bitcoin Script.
  2. Without considering Cell’s introspection, the distinction between lock scripts and type scripts can be seen as a security upgrade: it separates the asset definition of UDT from the custody methods. Furthermore, state-exposing type scripts (as well as Data) achieve the effect of UDT being a first-class citizen.

These two points imply a paradigm similar to “BTC + RGB” but with greater programming capabilities.

  3. When considering Cell’s introspection, Cells can achieve greater programmability than post-covenants BTC UTXOs and implement things that are difficult to achieve with BTC + RGB (since BTC cannot read RGB state).

As for use cases of CKB’s programmability, the examples provided here are only a few, owing to my limited knowledge of the CKB ecosystem. However, I expect that as time goes on, people will channel more and more of their creativity into CKB, creating applications that are currently beyond our imagination.


wow, cool ~


Updated the piece thanks to @Ajian’s feedback!


Really appreciate your effort. It is more than what I deserve.


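For CKB developers, programming based on Bitcoin script can practically be treated as a textbook, or even a shortcut.

After translating Ajian’s article, I broke free of the Ethereum mindset and came to understand CKB more deeply. People may find the concepts of UTXO and CKB hard to grasp today because they started from Ethereum’s account model and developed some mental inertia; approaching CKB from the Bitcoin perspective really does make its design quicker to understand.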


A great article, thanks for sharing!!


After reading this, I understand CKB more deeply :saluting_face::saluting_face::saluting_face:


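Very professional. This is written for developers, and although I can’t follow all of it, my impression is that, in terms of programmability and flexibility, CKB can be regarded as a super-charged version of BTC, and it really doesn’t have much in common with ETH.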


CKB really is powerful, which is why Cell studio was born.
