[Repost] Introduction to the RISC-V Instruction Set CKB-VM and its Requirements

Kelly · May 9, 2019, 2:35am

In this article, part 1 of a series, we introduce the Nervos CKB-Virtual Machine (CKB-VM), a VM based on the RISC-V instruction set and written in Rust, for executing smart contracts.

While selecting a virtual machine (VM) for our layer 1 blockchain, Nervos CKB, we considered the features that the CKB-VM would require.

For a VM to be used on a blockchain, it must meet two mandatory conditions:

Certainty (determinism): For a fixed program and input, the VM must always return the same output. The result must not depend on external conditions such as the time or the running environment.
Security: The VM must not affect the operation of its host.

The conditions above are mandatory. Beyond them, we also considered which VM design would best serve the CKB’s objectives. After our research, we propose the following features:

Flexibility

It is our goal to design a VM flexible enough to run for years or decades, allowing the CKB to evolve along with the field of cryptography. Current cryptographic primitives such as secp256k1 may fall out of use, while valuable new primitives and techniques (such as Schnorr or post-quantum signatures) will continue to emerge. New innovations should be made available to programs running on the VM, and it should be possible to retire primitives that are no longer used.

To illustrate this requirement, consider Bitcoin. Bitcoin currently uses SIGHASH for transaction signing and the SHA-256 hash algorithm for consensus. Can we say with certainty that SIGHASH will still be the best choice in several years, or that SHA-256 will remain a suitable hash algorithm as computing capabilities increase? In every blockchain protocol we have examined, implementing a new cryptographic primitive currently requires a hard fork.

In designing the CKB, we explored the possibility of reducing this hard fork requirement through the design of the VM layer.

The questions we ask are these: Can we allow cryptographic algorithms to be updated, or new transaction verification logic to be added, in the VM? While secp256k1 is still in use, would implementations of signature verification become more efficient if driven by economic incentives? If someone finds a better way to implement an algorithm used on CKB, or simply needs a new cryptographic primitive, can we enable them to do so freely?

We hope the VM layer of the CKB will provide maximum flexibility and broad implementation support to enable the exploration of these types of questions. With the CKB-VM’s design, users will not need to wait for a hard fork to use new cryptographic innovations.

Runtime Visibility

After researching the VMs of existing blockchains, we observed a problem. Again using Bitcoin as an example: the VM layer of Bitcoin provides only one stack, and this VM does not know how much data can be stored on the stack or how deep the stack can grow. Other stack-based VMs exhibit the same issue.

Though the consensus layer may define the stack depth directly or indirectly (based on the code length or gas restriction), programs running on the VM cannot obtain it. Because developers must guess the runtime state of their programs, those programs cannot use the VM's full capabilities.

With this in mind, we have considered defining up front the limits on all resources used during VM operation, including the gas limit and stack size, and giving programs running on the VM the ability to query resource usage. This will allow programs to choose different algorithms based on resource availability and so use the VM’s full potential.

With this construction, we see more VM flexibility enabled in the following scenarios:

a) Different policies can be selected for contracts that store data, based on cell capacity. When there is sufficient cell capacity, a program can store data directly to reduce the number of CPU cycles used. When cell capacity is constrained, the program can compress the data to fit the smaller capacity, at the cost of more CPU cycles.

b) Different handling mechanisms can be selected for contracts according to the amount of cell data and the size of remaining memory. When there is a small amount of cell data or substantial remaining memory, all cell data can be read into memory for handling. When there is a large amount of cell data or little remaining memory, a portion of the data can be read for each operation, similar to memory swapping.

c) For some common contracts, such as hash algorithms, different handling methods can be selected based on the number of CPU cycles provided by the user. For example, SHA3-256 is secure enough to meet most requirements, but a contract may use SHA3-512 to meet different security requirements by spending more CPU cycles.
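
The three scenarios above share one pattern: query the remaining budget, then pick a strategy. The following is a minimal Python sketch of that pattern. The `Budget` record, the function names, the one-byte storage tags, and the cycle threshold are all hypothetical, invented here for illustration; they are not part of CKB-VM.

```python
import hashlib
import zlib
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical view of the limits a VM could expose to a running program."""
    cycles_left: int    # CPU cycles still available
    capacity_left: int  # bytes of cell capacity still available


# Made-up threshold, purely illustrative.
STRONG_HASH_MIN_CYCLES = 50_000


def encode_for_storage(data: bytes, budget: Budget) -> bytes:
    """Scenario (a): store raw when capacity allows, else spend cycles compressing."""
    if len(data) + 1 <= budget.capacity_left:
        return b"R" + data                # 1-byte tag: raw
    packed = b"Z" + zlib.compress(data)   # 1-byte tag: zlib-compressed
    if len(packed) > budget.capacity_left:
        raise ValueError("data does not fit even after compression")
    return packed


def read_chunks(data: bytes, memory_left: int):
    """Scenario (b): yield all data at once if it fits, else memory-sized pieces."""
    if memory_left <= 0:
        raise ValueError("no memory available")
    step = max(min(len(data), memory_left), 1)
    for start in range(0, len(data), step):
        yield data[start:start + step]


def pick_hash(budget: Budget):
    """Scenario (c): use the longer digest only when cycles are plentiful."""
    if budget.cycles_left >= STRONG_HASH_MIN_CYCLES:
        return hashlib.sha3_512
    return hashlib.sha3_256
```

A contract written this way behaves identically for a given input and budget, so the choice of strategy does not compromise the certainty requirement.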

Runtime Overhead

The gas mechanism in the Ethereum virtual machine (EVM) is a brilliant design. It addresses the halting problem in the context of a blockchain VM by bounding the computation a program may consume, and allows for Turing-complete computation on a fully decentralized virtual machine. However, we have observed that it is difficult to design a proper gas computation method for different EVM opcodes.

We have found that the EVM has adjusted the gas computation mechanism in almost every version update. We wonder: can we ensure more efficient overhead calculation through VM design?
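
To make the gas idea concrete, here is a minimal sketch of a metered interpreter loop in Python. The three-instruction toy machine and its cost table are invented for illustration; they are neither EVM opcodes nor CKB-VM cycle costs. The loop charges a fixed cost before each instruction and aborts once the budget is exceeded, so even an infinite loop terminates.

```python
class OutOfGas(Exception):
    """Raised when a program exceeds its gas budget."""


# Hypothetical per-instruction costs for a toy machine.
COST = {"push": 1, "add": 2, "jump": 3}


def run(program, gas_limit):
    """Execute a list of (op, arg) tuples; return (stack, gas_used)."""
    stack, pc, gas_used = [], 0, 0
    while pc < len(program):
        op, arg = program[pc]
        gas_used += COST[op]          # charge before executing
        if gas_used > gas_limit:
            raise OutOfGas(f"needed more than {gas_limit} gas")
        if op == "push":
            stack.append(arg)
            pc += 1
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == "jump":
            pc = arg                  # unconditional jump to instruction index
    return stack, gas_used
```

The difficulty described above lives entirely in the `COST` table: each entry must track the real cost of executing that instruction, and any mismatch has to be corrected in a later version.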

We looked for a VM design that can deliver all of these features and found no existing solution that achieves our vision for the CKB.

Proposed Solution: Utilizing RISC-V

RISC-V is an open-source RISC instruction set architecture (ISA) designed by professors at the University of California, Berkeley, in 2010. The aim of RISC-V is to provide a common CPU ISA that enables the next generation of system architecture development for several decades without the burden of legacy architecture issues.

RISC-V can meet implementation requirements ranging from small, low-power microprocessors to high-performance data center processors. Compared with other CPU instruction sets, the RISC-V instruction set has the following advantages:

Openness

Both the core design and implementation of RISC-V are provided under a BSD license. All companies and agencies can utilize the RISC-V instruction set and create new hardware/software without restriction.

Simplicity

As a RISC instruction set, the 32-bit integer core instruction set of RISC-V has only 41 instructions. With support for 64-bit integers added, the instruction set still has only about 50 instructions. The x86 instruction set, by comparison, may have thousands of instructions. RISC-V is therefore easier to implement and leaves less room for bugs, while providing the same functionality.
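
The small size of the base ISA shows in how little code it takes to decode an instruction. As a sketch, the Python function below splits a 32-bit I-type instruction word into its fields, following the standard RISC-V base encoding. The function name and the returned dictionary are illustrative choices, not CKB-VM's API.

```python
def decode_i_type(word: int) -> dict:
    """Split a 32-bit RISC-V I-type instruction word into its fields."""
    imm = (word >> 20) & 0xFFF
    if imm & 0x800:                    # sign-extend the 12-bit immediate
        imm -= 0x1000
    return {
        "opcode": word & 0x7F,         # bits 6..0
        "rd": (word >> 7) & 0x1F,      # bits 11..7
        "funct3": (word >> 12) & 0x7,  # bits 14..12
        "rs1": (word >> 15) & 0x1F,    # bits 19..15
        "imm": imm,                    # bits 31..20
    }
```

For example, `decode_i_type(0x00500093)` yields opcode `0x13`, `rd` 1, `funct3` 0, `rs1` 0 and `imm` 5, which is the instruction `addi x1, x0, 5`.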

Modular Mechanism

Around its small core, RISC-V provides a modular mechanism for adding extended instruction sets. For example, the CKB might choose to implement the V extension defined by RISC-V to support vector computing, or add extended instruction sets for 256-bit integer computing, opening the possibility of high-performance cryptographic algorithms.
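
To see why dedicated 256-bit instructions could help, consider what a 64-bit core must do without them: a single 256-bit addition becomes four 64-bit additions with manual carry propagation. The sketch below models this in Python, with each list element standing in for one 64-bit register. The least-significant-first limb order is an arbitrary choice, and the code illustrates the arithmetic only; it does not reflect any actual CKB-VM extension.

```python
MASK64 = (1 << 64) - 1


def to_limbs(value: int) -> list:
    """Split a 256-bit integer into four 64-bit limbs, least significant first."""
    return [(value >> (64 * i)) & MASK64 for i in range(4)]


def from_limbs(limbs: list) -> int:
    """Reassemble an integer from its 64-bit limbs."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))


def add256(a: list, b: list) -> list:
    """Add two 4-limb numbers modulo 2**256, propagating the carry by hand."""
    out, carry = [], 0
    for x, y in zip(a, b):
        total = x + y + carry
        out.append(total & MASK64)  # keep the low 64 bits in this limb
        carry = total >> 64         # pass the overflow to the next limb
    return out
```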

Wide Support

The RISC-V instruction set is supported by compilers such as GCC and LLVM, and Rust and Go language implementations targeting RISC-V are being developed. Because the VM implementation of CKB will use the widely implemented ELF format, CKB VM contracts can be developed using any language that can be compiled to RISC-V instructions.
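
Because contracts are ordinary ELF files, a loader can identify a RISC-V binary from the first 20 bytes of the header. The sketch below uses constants from the ELF specification (`EM_RISCV` is 243). The function name is hypothetical, and a real loader would validate far more than this.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EM_RISCV = 243  # e_machine value assigned to RISC-V


def is_riscv_elf(header: bytes) -> bool:
    """Return True if `header` starts a little-endian RISC-V ELF file."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        return False
    if header[5] != 1:  # EI_DATA: 1 means little-endian
        return False
    # e_type and e_machine are 16-bit fields right after the 16-byte e_ident.
    _e_type, e_machine = struct.unpack_from("<HH", header, 16)
    return e_machine == EM_RISCV
```

To try it on a compiled contract, read the first 20 bytes of the file in binary mode and pass them to `is_riscv_elf`.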

Maturity

The RISC-V core instruction set has been finalized and frozen; all future RISC-V implementations need to be backward compatible with it. This removes the possibility of a CKB hard fork resulting from a VM instruction update. Additionally, the RISC-V instruction set has hardware implementations and has been verified in real-world application scenarios, so it avoids the risks that less widely supported instruction sets may carry.

Even though other instruction sets may have some of the qualities listed above, the RISC-V instruction set is the only one that delivers all of them according to our evaluation. Based on this, we have chosen to implement CKB VM with the RISC-V instruction set and utilize the ELF format for smart contracts to ensure wide language support.

In addition, we will add dynamic linking for CKB VM to ensure cell sharing. Even though the official CKB implementation will provide popular cryptographic primitives, we encourage the community to provide more optimized cryptographic algorithm implementations to reduce runtime overhead (CPU cycles).

The topic of developer incentives to improve cryptographic primitives on CKB is interesting and has been frequently discussed among the CKB team. It is our hope that the CKB VM will develop and improve with the evolution of cryptography and community, without the need for hard forks to upgrade the protocol.

Original link: https://www.allaboutcircuits.com/industry-articles/introduction-to-the-risc-v-instruction-set-ckd-vm-and-its-requirements/
Author: XueJie Xiao