# [Spark Program] CKB-VM Sail Formal Verification — Proving CKB-VM RISC-V Instruction Equivalence via Sail Specification and Coq Theorem Prover / CKB-VM Sail 形式化验证 — 基于 Sail 规范与 Coq 定理证明器的 CKB-VM RISC-V 指令等价性证明
-–
## English Version
### 1. Project Name and Summary
**Project Name:** ckb-vm-sail-verify
**One-line Summary:** Formally verify that CKB-VM’s RISC-V instruction execution semantics are mathematically equivalent to the official Sail RISC-V specification, using Coq theorem proofs and differential testing as dual verification.
### 2. Team / Individual Introduction
**Applicant:** Tinyueng ([GitHub](TinyuengKwan (Tinyueng) · GitHub))
**Core Competencies:** Currently interning at PLCT Lab (Institute of Software, Chinese Academy of Sciences) on the **Sail & ACT (RISC-V Architectural Certification Tests) team**, with direct, hands-on experience in the Sail architecture definition language and RISC-V conformance testing. Contributed to the [sail-riscv](GitHub - riscv/sail-riscv: Sail RISC-V model · GitHub) ecosystem and developed [sail-lsp](GitHub - TinyuengKwan/sail-lsp · GitHub), a Language Server Protocol implementation for Sail. Familiar with the Sail compiler toolchain (including the `–coq` backend for theorem prover integration), the sail-riscv formal model structure, and the RISC-V Architectural Certification Test framework. Possesses strong Rust systems programming capabilities (ownership, lifetimes, trait system). Systematic study of CS:APP with a complete knowledge framework spanning processor architecture, virtual memory, ELF linking, and system-level I/O. Additionally studied compiler theory and x86 assembly.
**Relevant Domain Knowledge:** Deep familiarity with the Sail formal specification language — not just as a user, but as a tooling contributor (sail-lsp). Hands-on experience with RISC-V Architectural Certification Tests (ACT), understanding how conformance is validated against the Sail reference model. Familiar with RISC-V ISA architecture (RV64IMAC, privilege levels, extension mechanisms). Experience with the Coq interactive theorem prover and Sail’s Coq backend output. Understanding of CKB-VM internals through source code analysis of nervosnetwork/ckb-vm, including its versioned instruction semantics, MOP (Macro-Op Fusion) extensions, and flat memory model.
### 3. Problem Description
CKB-VM is the RISC-V virtual machine that powers all on-chain computation in the Nervos CKB network. Every CKB transaction — every lock script, type script, and smart contract — executes inside CKB-VM. Its correctness is the security foundation of the entire chain. Yet today, there is no formal proof that CKB-VM faithfully implements the RISC-V specification.
**No formal verification of instruction semantics.** CKB-VM implements 158 opcodes across RV64I, M, C (Zca), B (Zba/Zbb/Zbc/Zbs), A, and custom MOP extensions. Each instruction is hand-written in Rust. While the implementation is high-quality and battle-tested, no mathematical proof exists that any single instruction behaves identically to the RISC-V specification. This is [nervosnetwork/ckb-vm#190](Formally Verify CKB-VM via Sail · Issue #190 · nervosnetwork/ckb-vm · GitHub), open since 2021.
**Testing alone cannot guarantee correctness.** CKB-VM passes the standard riscv-tests suite, but test suites are finite — they cannot cover all possible input states, edge cases, or corner-case interactions. A subtle semantic divergence (e.g., incorrect sign extension on a rare operand combination) could go undetected for years until exploited.
**No authoritative ground truth for comparison.** The RISC-V ISA specification is written in natural language (English prose), which is inherently ambiguous. Different implementors may interpret the same sentence differently. Without a machine-readable formal specification as the reference, “correctness” remains a matter of human judgment.
**Sail RISC-V solves the ground truth problem.** RISC-V International has adopted Sail as the official formal specification language for RISC-V. The [sail-riscv](GitHub - riscv/sail-riscv: Sail RISC-V model · GitHub) model is the machine-readable, executable, mathematically precise definition of RISC-V instruction semantics. Crucially, Sail can compile to Coq — enabling theorem proving directly against the official specification.
**Semantic gaps between CKB-VM and standard RISC-V.** CKB-VM is not a generic RISC-V implementation. It has a flat 4MB memory model (no MMU/virtual memory), custom ECALL handling (syscall 93 for exit, not M-mode trap), three VERSION modes with different behavioral semantics, cycle-counting per instruction, and MOP fusion instructions that have no counterpart in the standard specification. These gaps must be formally documented and their impact on equivalence precisely characterized.
**Real-world impact:** A formally verified CKB-VM would provide the strongest possible correctness guarantee for CKB’s execution layer. It would eliminate an entire class of potential vulnerabilities — instruction-level semantic bugs — that no amount of testing can fully prevent. For a blockchain whose security model depends entirely on deterministic execution, this is a foundational investment.
### 4. Solution
#### 4.1 Core Approach
ckb-vm-sail-verify employs a dual verification strategy: **Coq formal proofs** for mathematical certainty and **differential testing** for practical runtime validation. The two approaches are complementary — formal proofs cover all possible inputs for proven instructions, while differential testing validates the full instruction set against concrete test vectors.
#### 4.2 Why Sail + Coq
**Sail is the official RISC-V specification.** Unlike informal prose specifications, Sail is executable, unambiguous, and maintained by RISC-V International. Using Sail as ground truth means we verify against the same specification that hardware vendors use.
**Sail compiles to Coq.** The Sail compiler’s `–coq` backend generates Coq definitions from Sail source. This gives us machine-checkable RISC-V semantics inside Coq — the same theorem prover used by CompCert (verified C compiler) and seL4 (verified OS kernel).
**Coq proofs are total.** Once a Coq theorem is proved, it holds for all possible inputs — not just test vectors. A proven `ADD` instruction is correct for every combination of source registers and machine states, unconditionally.
**Reproducible and auditable.** Coq proofs are machine-checked. Anyone can re-run `make coq` and verify every theorem independently. No trust required in the prover — only in Coq’s kernel, which is one of the most scrutinized pieces of software in existence.
#### 4.3 Four-Layer Verification Architecture
**Layer 1: Sail RISC-V Formal Specification (Ground Truth)**
The official sail-riscv model, compiled to Coq using `sail --coq`. Generates ~30,000 lines of Coq defining the complete RV64 instruction semantics in a monadic state transformer style. This is the authoritative reference — what RISC-V *should* do.
**Layer 2: CKB-VM Coq Model (Verification Target)**
A hand-written Coq model of CKB-VM’s Rust interpreter logic, capturing register file operations (32 × 64-bit, x0 hardwired to zero), PC advancement, ALU computations, memory access (little-endian, flat 4MB), and branch/jump semantics. Based on `handle_*` functions in ckb-vm’s `execute.rs`. This is what CKB-VM *actually does*.
**Layer 3: Equivalence Proofs (Mathematical Bridge)**
For each target instruction, prove: `ckb_vm_execute(I, state) = sail_execute(I, state)`. Each proof decomposes into three sub-theorems: (a) **Semantics** — destination register has the correct computed value; (b) **PC update** — next PC is PC+4 (or PC+offset for branches/jumps); (c) **Register isolation** — all other registers are unchanged. Proof technique: unfold definitions → rewrite with helper lemmas → solve with `lia`/`reflexivity`.
**Layer 4: Differential Testing (Runtime Validation)**
Execute identical RISC-V ELF binaries on both CKB-VM (Rust) and the Sail C++ emulator, then compare execution traces step-by-step — PC values, register states, and exit codes. Complements formal proofs by covering the full instruction set (including instructions not yet formally proven) and catching implementation bugs that might not be visible in the Coq model.
#### 4.4 State Equivalence Definition
The core challenge is bridging Sail’s monadic Coq output (stateful monad transformer with effects) and CKB-VM’s pure functional Coq model. We define state equivalence as:
```
state_equiv(ckb_state, sail_state) :=
∀ i ∈ [0, 31], get_reg(ckb_state, i) = extract_reg(sail_state, i)
∧ ckb_state.pc = extract_pc(sail_state)
∧ ∀ addr ∈ [0, 4MB), load_byte(ckb_state, addr) = extract_mem(sail_state, addr)
```
This strips away Sail’s monad and CKB-VM’s cycle accounting, comparing only the observable architectural state that must agree.
#### 4.5 CKB-VM Specific Considerations
| Aspect | CKB-VM Behavior | Sail RISC-V Behavior | Handling |
|--------|-----------------|---------------------|----------|
| x0 register | Write then clear to zero | Discard writes silently | Functionally equivalent — prove in Coq |
| ECALL | Dispatch via A7 (syscall 93 = exit) | Trap to M-mode handler | Filter in diff-test; exclude from Coq scope |
| Memory model | Flat 4MB, no MMU, W^X | Sv39/48/57 full MMU | CKB-VM simplified model; prove within 4MB subset |
| FENCE | No-op (single-threaded) | Full fence semantics | No observable difference in single-hart model |
| Atomics (A ext) | Single address reservation | Reservation set | Limited concurrency; prove single-hart subset |
| VERSION modes | Three modes (0/1/2) | None | Target VERSION2 only |
| Cycle counting | Per-instruction cost tracking | None | Filter out; not part of architectural state |
| MOP fusion | WIDE_MUL, FAR_JUMP_REL, ADC | No counterpart | Verify as compositions of standard instructions |
#### 4.6 Proof Example
```coq
(* For ADD instruction: rd = rs1 + rs2 (mod 2^64), PC += 4 *)
Theorem ckb_add_semantics : forall rd rs1 rs2 st,
rd <> 0 ->
get_reg (ckb_add rd rs1 rs2 st) rd =
truncate_64 (get_reg st rs1 + get_reg st rs2).
Proof.
intros. unfold ckb_add, set_reg, get_reg.
destruct (Nat.eq_dec rd rd); [| contradiction].
apply truncate_64_mod. Qed.
Theorem ckb_add_pc : forall rd rs1 rs2 st,
(ckb_add rd rs1 rs2 st).(pc) = st.(pc) + 4.
Proof. intros. unfold ckb_add. simpl. lia. Qed.
Theorem ckb_add_isolation : forall rd rs1 rs2 st r,
r <> rd ->
get_reg (ckb_add rd rs1 rs2 st) r = get_reg st r.
Proof.
intros. unfold ckb_add, set_reg, get_reg.
destruct (Nat.eq_dec r rd); [contradiction | reflexivity]. Qed.
```
### 5. Detailed Technical Implementation Plan
#### 5.1 Phase 0: Environment Setup and Instruction Mapping (Week 1)
Install the complete toolchain: Sail (>= 0.20.x via OPAM), Coq (9.0.0 + coq-sail-stdpp), CMake, Rust (1.92.0+), GMP, and RISC-V cross-compiler (`gcc-riscv64-unknown-elf`). Clone sail-riscv as a git submodule. Build the Rust workspace (`lib/` + `crates/diff-test/`). Generate Coq from Sail via `sail --coq --dcoq-undef-axioms`. Build the Sail C++ emulator for differential testing.
Create the complete CKB-VM instruction mapping table: map all 158 CKB-VM opcodes to their Sail RISC-V counterparts, classify by extension (I/M/C/A/B/MOP), and identify instructions with no Sail equivalent (MOP fusion ops).
#### 5.2 Phase 1: Differential Testing Framework (Week 2)
Implement the diff-test CLI tool with step-by-step execution trace comparison:
**CKB-VM side (`lib/src/runner.rs`):** Wrap CKB-VM in a step-by-step executor that captures `StepState` (PC + 32 registers) after each instruction. Handle exit syscall (syscall 93 via A7 register). Support `–max-steps` to prevent infinite loops.
**Sail side (`crates/diff-test/src/sail_runner.rs`):** Parse the Sail C++ emulator’s trace output format. Extract PC and register values per step. Handle ECALL divergence (riscv-tests use `tohost` memory-mapped I/O; CKB-VM uses syscall).
**Comparison engine (`lib/src/state.rs`):** `CompareResult` with `first_mismatch` detection — reports the exact step, PC, and register where traces diverge. Support `–verbose` mode for full trace dump and `–json` for machine-readable output.
**Test corpus:** Auto-discover ELF test artifacts from riscv-tests (rv64ui, rv64um, rv64uc, rv64ua), riscv-arch-test, and CKB-VM’s own test suite.
#### 5.3 Phase 2: Coq Infrastructure and Sail Interface (Week 3)
This is the most challenging phase — bridging Sail’s monadic Coq output with CKB-VM’s pure model.
**Study generated Coq (~30,000 lines):** Understand Sail’s state monad encoding, register access patterns, bitvector arithmetic library, and instruction decode/execute structure. Identify the Coq functions corresponding to each RISC-V instruction (e.g., `execute_RISCV_ADD`).
**Define state equivalence relation:** Formalize the mapping between CKB-VM’s `machine_state` record and Sail’s monadic state. Handle type mismatches (Sail uses bitvectors; CKB-VM model uses Z integers).
**Build proof infrastructure:** Create reusable Coq tactics (`solve_truncate`, `simplify_regs`, `unfold_sail_step`) that automate common proof patterns. Test with a minimal import: `Check execute_RISCV_ADD`.
**Document the bridging strategy:** Record Sail’s Coq structure, naming conventions, and the exact approach for extracting observable state from monadic computations.
#### 5.4 Phase 3: Core ALU Instruction Proofs (Week 4)
Prove 7 core instructions with full three-theorem coverage (semantics + PC + register isolation):
| Instruction | Type | Key Proof Challenge |
|-------------|------|-------------------|
| ADD | R-type | `truncate_64` idempotence, x0 hardwiring |
| SUB | R-type | Unsigned subtraction mod 2^64 |
| ADDI | I-type | Sign extension of 12-bit immediate |
| SLLI | Shift | Shift amount masking (lower 6 bits for RV64) |
| SRLI | Shift | Logical vs. arithmetic right shift distinction |
| SRAI | Shift | Sign bit preservation across arithmetic shift |
| MUL | M-ext | 64-bit multiply, lower 64 bits of 128-bit product |
Build reusable lemma library: `truncate_64_idempotent`, `x0_always_zero`, `get_set_reg_same`, `get_set_reg_diff`, `sign_extend_properties`.
#### 5.5 Phase 4: Control Flow and Memory Proofs (Week 5)
**Branch instructions:**
- BEQ: Prove both taken path (PC += offset) and not-taken path (PC += 4). Handle sign-extended branch offset.
- Additional branches (BNE, BLT, BGE, BLTU, BGEU) follow the same pattern.
**Jump instructions:**
- JAL: Prove link address (rd = PC + 4) and jump target (PC = PC + offset). Handle x0 case (no link).
- JALR: Prove indirect jump with LSB cleared.
**Memory instructions:**
- Build memory lemmas: store/load roundtrip (`load(store(addr, val)) = val`), byte-ordering (little-endian), alignment.
- LW: Prove sign-extended 32-bit load.
- SW: Prove 32-bit store (may partially Admit if byte-level memory model complexity exceeds time budget).
#### 5.6 Phase 5: Extended Proofs and MOP Verification (Week 6)
**Additional ALU proofs:** AND, OR, XOR, SLTI, SLTIU, LUI, AUIPC.
**MOP fusion verification:** CKB-VM’s custom MOP instructions are implemented as multi-instruction fusions. Verify that:
- `WIDE_MUL(rd1, rd2, rs1, rs2)` = sequential `MULH(rd1, rs1, rs2); MUL(rd2, rs1, rs2)`
- `FAR_JUMP_REL(rd, offset)` = `AUIPC(rd, upper); JALR(rd, rd, lower)`
**Proof audit:** Review all `Admitted` lemmas. Attempt to close as many as possible. Document remaining admits with clear justification.
**Target:** 10+ core instructions fully proved (zero Admitted), 5+ additional instructions with partial proofs.
#### 5.7 Phase 6: Full Differential Test Coverage and Gap Analysis (Week 7)
**Per-extension testing:** Run differential tests across rv64ui (integer), rv64um (multiply/divide), rv64uc (compressed), rv64ua (atomic) test suites. Add `–extension` CLI filter for targeted runs.
**Edge case tests:** Division by zero behavior, maximum shift amounts, integer overflow/underflow, misaligned memory access, x0 write attempts.
**Semantic gap documentation:** Produce a formal gap analysis document covering all 8 identified semantic gaps (x0 handling, ECALL, memory model, FENCE, atomics, VERSION modes, cycle counting, MOP), with precise characterization of each gap’s impact on equivalence claims.
**Instruction coverage matrix:** Update the 158-opcode mapping table with proof status (Proved / Partially Proved / Diff-Test Only / Not Covered) and test status (Pass / Fail / Skipped / N/A).
#### 5.8 Phase 7: Documentation, Deliverables, and Roadmap (Week 8)
**Clean-room build test:** Clone from scratch, `make all` succeeds, all Coq proofs compile, differential tests pass.
**Completion report:** Number of theorems proved, number of instructions with full coverage, number of differential tests passing, complete gap analysis, remaining `Admitted` proofs with justification.
**Phase 2+ roadmap:** Full RV64I coverage (~50 instructions), M/C/B extension proofs, ASM-mode verification (requiring Islaris-level tooling, ~6+ months, $20k+ scope).
**Code quality:** `cargo fmt`, `cargo clippy`, remove temporary files, ensure CI-ready build.
### 6. Expected Deliverables
#### 6.1 Core Deliverables
1. **Coq proof library (coq/)** — Formally proved equivalence theorems for 10+ core RISC-V instructions, covering RV64I ALU operations, shifts, branches, jumps, and at least one M-extension instruction. Each instruction proved with three sub-theorems (semantics, PC update, register isolation). All proofs machine-checked by Coq — zero trust required.
2. **Differential testing framework (crates/diff-test/)** — CLI tool that executes identical ELF binaries on CKB-VM and Sail C++ emulator, comparing execution traces step-by-step. Supports `–verbose`, `–json`, `–max-steps`, `–test-dir` flags. Reports first divergence point with full state dump.
3. **CKB-VM instruction mapping document** — Complete mapping of all 158 CKB-VM opcodes to Sail RISC-V counterparts, classified by extension, with proof/test status annotations.
4. **Formal semantic gap analysis** — Rigorous documentation of all 8 identified semantic gaps between CKB-VM and standard RISC-V, with precise characterization of impact on equivalence claims and mitigation strategies.
5. **Methodology documentation (doc/)** — Architecture guide and proof methodology document enabling future contributors to extend proofs to additional instructions without re-learning the entire framework.
6. **Build-from-source reproducibility** — Single `make all` command builds everything (Coq proofs + Sail emulator + Rust diff-test tool). Documented toolchain requirements and setup script.
7. **Phase 2+ roadmap** — Detailed plan for extending to full RV64I, M/C/B extensions, and ASM-mode verification, suitable for Community Fund DAO-scale proposal.
#### 6.2 Concrete Output Examples
##### Example A — Coq Proof Compilation Output
```
$ make coq
coqc -R . CkbVmVerify -R generated Riscv MachineState.v
coqc -R . CkbVmVerify -R generated Riscv CkbVmModel.v
coqc -R . CkbVmVerify -R generated Riscv InstructionEquiv.v
Proved: ckb_add_semantics
Proved: ckb_add_pc
Proved: ckb_add_isolation
Proved: ckb_sub_semantics
Proved: ckb_sub_pc
Proved: ckb_sub_isolation
Proved: ckb_addi_semantics
…
────────────────────────────────────────
Total: 33 theorems proved, 0 admitted.
All proofs verified by Coq 9.0.0. ![]()
```
All theorems are machine-checked. Any modification that breaks a proof will cause `make coq` to fail — continuous verification by construction.
##### Example B — Differential Test Output (Pass)
```
$ cargo run --release -p ckb-vm-diff-test – \
--test-dir tests/rv64ui/ --sail-bin deps/sail-riscv/build/sail_riscv_sim
Running 55 differential tests (rv64ui)…
rv64ui-p-add … CKB-VM: 127 steps, Sail: 127 steps →
MATCH
rv64ui-p-addi … CKB-VM: 98 steps, Sail: 98 steps →
MATCH
rv64ui-p-and … CKB-VM: 112 steps, Sail: 112 steps →
MATCH
rv64ui-p-andi … CKB-VM: 95 steps, Sail: 95 steps →
MATCH
rv64ui-p-auipc … CKB-VM: 43 steps, Sail: 43 steps →
MATCH
rv64ui-p-beq … CKB-VM: 156 steps, Sail: 156 steps →
MATCH
…
────────────────────────────────────────
Results: 55 passed, 0 failed, 0 skipped.
All execution traces match. ![]()
```
##### Example C — Differential Test Output (Divergence Detected)
```
$ cargo run --release -p ckb-vm-diff-test – \
--elf tests/edge/division_by_zero.elf --verbose
Step 1: PC=0x80000000 CKB-VM
Sail ![]()
Step 2: PC=0x80000004 CKB-VM
Sail ![]()
Step 3: PC=0x80000008 CKB-VM
Sail ![]()
Step 4: PC=0x8000000c
DIVERGENCE DETECTED
CKB-VM state:
PC = 0x80000010
x10 = 0xffffffffffffffff ← DIV by zero → all-ones
Sail state:
PC = 0x80000010
x10 = 0xffffffffffffffff
Register diff: (none — values match at this step)
First divergence: step 4, register x10
Note: Both implementations return all-ones for DIV by zero
(RISC-V spec mandated behavior). Trace matches.
────────────────────────────────────────
Result:
MATCH (divergence was false alarm after full comparison)
```
##### Example D — JSON Machine-Readable Output
```
$ cargo run --release -p ckb-vm-diff-test – \
--elf tests/rv64ui/rv64ui-p-add --json
{
“elf”: “tests/rv64ui/rv64ui-p-add”,
“ckb_vm_steps”: 127,
“sail_steps”: 127,
“ckb_vm_exit_code”: 0,
“sail_exit_code”: 0,
“match”: true,
“first_mismatch”: null,
“trace_comparison”: {
"total_steps_compared": 127,
"registers_compared_per_step": 32,
"all_match": true
}
}
```
#### 6.3 Reproducible Build Environment
```bash
git clone --recursive https://github.com/xxxx/ckb-vm-sail-verify.git
cd ckb-vm-sail-verify
make all # Builds everything: Coq proofs + Sail emulator + diff-test tool
make test # Runs differential tests against riscv-tests suite
```
**Toolchain requirements documented in README:**
- Rust >= 1.92.0
- OPAM with Sail >= 0.20.x, Coq 9.0.0, coq-sail-stdpp
- CMake >= 3.20
- GMP development library
- RISC-V cross-compiler (for custom test assembly)
**`scripts/` directory** provides automated setup:
- `generate_coq.sh` — Generate Coq from Sail source
- `build_sail_emulator.sh` — Build Sail C++ emulator
- `run_differential.sh` — Run full differential test suite
#### 6.4 Acceptance Criteria
##### Formal Verification Criteria
- V-1: `make coq` compiles all Coq proofs without errors or warnings on a clean environment.
- V-2: At least 10 RISC-V instructions have complete three-theorem proofs (semantics + PC + isolation) with zero `Admitted`.
- V-3: At least one M-extension instruction (MUL) is formally proved.
- V-4: All helper lemmas (`truncate_64_idempotent`, `x0_always_zero`, `get_set_reg_*`) are fully proved.
- V-5: State equivalence relation is formally defined and used consistently across all proofs.
##### Differential Testing Criteria
- T-1: Differential tests pass on the complete rv64ui test suite (55 tests).
- T-2: Differential tests pass on rv64um test suite (13 tests).
- T-3: The diff-test tool correctly detects and reports injected semantic divergences (negative testing).
- T-4: `–json` output is valid JSON parseable by `jq`.
- T-5: Edge case tests (division by zero, max shift, overflow) produce matching traces.
##### Documentation Criteria
- D-1: Complete 158-opcode mapping table with proof/test status.
- D-2: Semantic gap analysis covers all 8 identified gaps with impact assessment.
- D-3: Clean-room build test: fresh clone → `make all` succeeds → `make test` passes.
- D-4: Bilingual (English + Chinese) README with quick-start instructions.
### 7. Funding Request and Usage
**Total Amount Requested:** 1,000 USD
**Payment Method:** 100% CKB
| Category | Amount | Description |
|----------|--------|-------------|
| Cloud Server | $350 USD | 1 VPS (Linux, ≥ 4 cores 16GB RAM) for Coq compilation (memory-intensive), Sail emulator builds, and differential test execution. 8 weeks. |
| Developer Compensation | $450 USD | Core development work. Estimated 20–30 hours per week, 8 weeks. Covers Coq proof development, Rust diff-test framework, and Sail integration. |
| Documentation & Community | $200 USD | Bilingual documentation, architecture diagrams, 2 monthly sharing sessions, completion report, and Phase 2+ roadmap preparation. |
### 8. Estimated Completion Timeline
**Total Duration:** 8 weeks (~2 months)
#### Phase 1: Infrastructure and Feasibility (Week 1–3)
**Week 1:** Complete toolchain installation (Sail + Coq + CMake + Rust + RISC-V cross-compiler). Clone sail-riscv submodule. Generate Coq from Sail. Build Sail C++ emulator. Build Rust workspace. Create 158-opcode instruction mapping table.
**Week 2:** Implement differential testing framework — CKB-VM step executor, Sail trace parser, comparison engine, CLI tool with `–verbose`/`–json`/`–max-steps`. Handle ECALL divergence between riscv-tests and CKB-VM. Run initial diff-tests on rv64ui suite.
**Week 3:** Study Sail-generated Coq (~30,000 lines). Define state equivalence relation. Build proof infrastructure and reusable tactics. Create minimal test import (`Check execute_RISCV_ADD`). Document Sail Coq bridging strategy. **Most challenging week** — debugging imports, name conflicts, library dependencies.
**Milestone 1 (End of Week 3):** Differential testing framework operational with rv64ui tests passing. Coq infrastructure compiles against Sail-generated definitions. Instruction mapping table complete.
#### Phase 2: Formal Proofs (Week 4–6)
**Week 4:** Prove 7 core ALU instructions (ADD, SUB, ADDI, SLLI, SRLI, SRAI, MUL) with full three-theorem coverage. Build reusable lemma library.
**Week 5:** Prove control flow instructions (BEQ taken/not-taken, JAL link+jump). Build memory lemmas. Prove LW, SW. Submit mid-term progress report.
**Week 6:** Prove additional ALU instructions (AND, OR, XOR, SLTI, JALR). Verify MOP fusion equivalence (WIDE_MUL = MULH + MUL). Audit all `Admitted` proofs.
**Milestone 2 (End of Week 6):** 10+ instructions with complete proofs. MOP fusion verified. Mid-term report submitted.
#### Phase 3: Integration and Delivery (Week 7–8)
**Week 7:** Full differential test coverage across rv64ui/rv64um/rv64uc/rv64ua. Edge case testing. Produce formal semantic gap analysis document. Update instruction coverage matrix with proof/test status.
**Week 8:** Clean-room build test. Bilingual documentation. Code cleanup (fmt, clippy). Completion report. Phase 2+ roadmap. Community sharing session. Final submission.
**Milestone 3 (End of Week 8):** All deliverables submitted — Coq proof library, diff-test framework, instruction mapping, gap analysis, methodology docs, and roadmap.
#### Timeline Overview
| Phase | Weeks | Focus | Milestone |
|-------|-------|-------|-----------|
| Phase 1: Infrastructure | Week 1–3 | Toolchain, diff-test framework, Coq infrastructure, instruction mapping | Milestone 1 |
| Phase 2: Formal Proofs | Week 4–6 | Core ALU proofs, control flow, memory, MOP, proof audit | Milestone 2 (Week 6) |
| Phase 3: Integration & Delivery | Week 7–8 | Full test coverage, gap analysis, docs, clean-room build, submission | Milestone 3 |
### 9. Relevance to CKB Ecosystem
**Addresses a long-standing open issue.** [ckb-vm#190](Formally Verify CKB-VM via Sail · Issue #190 · nervosnetwork/ckb-vm · GitHub) has been open since 2021, requesting formal verification of CKB-VM’s RISC-V implementation. This project directly delivers a proof-of-concept solution.
**Strengthens CKB’s security foundation.** CKB-VM executes every transaction on the Nervos network. A formally verified instruction set eliminates an entire class of potential vulnerabilities — instruction-level semantic bugs that testing cannot fully prevent. This is especially critical for a blockchain where deterministic execution is a security invariant.
**Establishes a reusable verification framework.** The Coq proof infrastructure, differential testing tool, and methodology documentation are designed for extension. Future contributors can add proofs for additional instructions following the established patterns, without re-learning the framework.
**Demonstrates CKB’s technical depth.** Formal verification of a blockchain VM is rare in the industry. CompCert verified a C compiler; seL4 verified an OS kernel; this project brings the same rigor to CKB’s execution layer. It positions CKB alongside the most technically ambitious blockchain projects.
**Enables future full-ISA verification.** This PoC covers 10+ instructions and establishes the methodology. The Phase 2+ roadmap (full RV64I, extensions, ASM-mode) provides a clear path to comprehensive verification suitable for a Community Fund DAO proposal.
**Pure Rust + Coq toolchain.** The differential testing framework is pure Rust, consistent with CKB’s technology stack. Rust developers in the CKB community can contribute without learning new languages (Coq proofs are the specialist component, but the testing framework is accessible to all).
### 10. Technical Risks and Mitigations
| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| Sail Coq output structure changes between versions | High | Low | Pin sail-riscv to a specific commit. Document exact Sail compiler version. Provide `generate_coq.sh` for reproducible regeneration. |
| Sail monadic Coq too complex to bridge with CKB-VM pure model | High | Medium | Week 3 is dedicated to this. Fallback: extract observable state post-execution rather than proving step-internal equivalence. Worst case: prove against a simplified Sail extract rather than raw monadic output. |
| Coq compilation time exceeds development iteration speed | Medium | Medium | Generated Coq is ~30K lines. Use `_CoqProject` with selective compilation. Develop proofs incrementally. Cloud server with 16GB RAM for parallel coqc. |
| CKB-VM Rust semantics diverge from Coq model | Medium | Medium | Differential testing catches runtime divergences. Cross-reference Coq model against actual Rust source. Document any abstraction gaps. |
| riscv-tests ECALL convention incompatible with CKB-VM | Medium | Low | Already identified and designed for: CKB-VM uses syscall 93, riscv-tests use `tohost`. Sail runner filters ECALL divergence. |
| Insufficient time for 10+ instruction proofs | Medium | Medium | Prioritize core ALU instructions (ADD, SUB, ADDI) which have the simplest proof structure. Build reusable tactics early. Accept partial proofs (documented Admits) for complex instructions (SW, JALR). |
| Sail C++ emulator build failures on target platform | Low | Low | CMake build is well-documented in sail-riscv. Provide `build_sail_emulator.sh` with error handling. GMP is the only non-trivial dependency. |
| Generated Coq requires unavailable Coq libraries | Low | Medium | Use `coq-sail-stdpp` as documented by sail-riscv project. Pin exact Coq version (9.0.0) in build instructions. |
### 11. Transparency Commitments
**Fully open-source from Day 1.** All code on GitHub under MIT license, public repository from the start of development.
**Weekly progress updates.** Posted on Nervos Talk forum every week, covering completed tasks, blockers, and next-week plan.
**Monthly sharing sessions.** Two sessions total (Week 4 and Week 8), the final session includes a live demo of Coq proof compilation and differential test execution, plus Q&A.
**Machine-verifiable results.** All Coq proofs are machine-checked — anyone can clone the repo, run `make coq`, and independently verify every theorem. No trust in the author required.
**Honest reporting of limitations.** The completion report will explicitly document: number of `Admitted` proofs and why, semantic gaps that prevent full equivalence claims, and instructions not yet covered.
**Reproducible from scratch.** Complete build instructions, pinned dependencies, and setup scripts ensure any reviewer can reproduce all results independently.
-–
-–
## 中文版本
### 一、项目名称与简介
**项目名称:** ckb-vm-sail-verify
**一句话简介:** 基于 Sail RISC-V 官方规范与 Coq 定理证明器,形式化验证 CKB-VM 的 RISC-V 指令执行语义与标准规范的数学等价性,辅以差分测试进行双重验证。
### 二、团队/个人介绍
**申请人:** Tinyueng([GitHub](TinyuengKwan (Tinyueng) · GitHub))
**核心能力:** 目前在 PLCT Lab(中科院软件所)****Sail & ACT(RISC-V Architectural Certification Tests)小队****实习,对 Sail 体系结构定义语言和 RISC-V 一致性测试有直接、深入的实践经验。参与 [sail-riscv](GitHub - riscv/sail-riscv: Sail RISC-V model · GitHub) 生态贡献,并独立开发了 [sail-lsp](GitHub - TinyuengKwan/sail-lsp · GitHub)——Sail 语言的 Language Server Protocol 实现。熟悉 Sail 编译器工具链(包括用于定理证明器集成的 `–coq` 后端)、sail-riscv 形式模型结构以及 RISC-V 架构认证测试(ACT)框架。具备 Rust 系统编程能力(所有权、生命周期、trait 系统)。系统学习过 CS:APP,对处理器架构、虚拟内存、ELF 链接、系统级 I/O 有完整知识框架。此外还学习过编译原理、x86 汇编等内容。
**相关领域知识:** 对 Sail 形式规范语言有深度了解——不仅是使用者,更是工具链贡献者(sail-lsp)。具有 RISC-V 架构认证测试(ACT)的实践经验,理解如何基于 Sail 参考模型验证一致性。熟悉 RISC-V ISA 体系结构(RV64IMAC、特权级、扩展机制)。具有 Coq 交互式定理证明器和 Sail Coq 后端输出的使用经验。通过源码分析深入理解 CKB-VM 内部机制(nervosnetwork/ckb-vm),涵盖其版本化指令语义、MOP(宏操作融合)扩展和平坦内存模型。
### 三、问题描述
CKB-VM 是驱动 Nervos CKB 网络所有链上计算的 RISC-V 虚拟机。每一笔 CKB 交易——每一个 lock script、type script 和智能合约——都在 CKB-VM 中执行。其正确性是整条链的安全基石。然而目前,没有任何形式化证明表明 CKB-VM 忠实地实现了 RISC-V 规范。
**指令语义缺乏形式化验证。** CKB-VM 实现了横跨 RV64I、M、C(Zca)、B(Zba/Zbb/Zbc/Zbs)、A 和自定义 MOP 扩展的 158 条操作码。每条指令都是手工编写的 Rust 代码。尽管实现质量高且经过实战考验,但不存在任何数学证明表明其中任何一条指令的行为与 RISC-V 规范完全一致。这是 [nervosnetwork/ckb-vm#190](Formally Verify CKB-VM via Sail · Issue #190 · nervosnetwork/ckb-vm · GitHub),自 2021 年开放至今。
**测试无法保证正确性。** CKB-VM 通过了标准 riscv-tests 测试套件,但测试套件是有限的——无法覆盖所有可能的输入状态、边缘情况或极端条件下的交互。一个微妙的语义偏差(例如,在某个罕见操作数组合上的符号扩展错误)可能多年未被发现,直到被利用。
**缺乏权威的比对基准。** RISC-V ISA 规范以自然语言(英文散文)编写,天然存在歧义。不同实现者可能对同一句话有不同理解。没有机器可读的形式规范作为参考,"正确性"仍然是人为判断。
**Sail RISC-V 解决了基准问题。** RISC-V International 已采纳 Sail 作为 RISC-V 的官方形式规范语言。[sail-riscv](GitHub - riscv/sail-riscv: Sail RISC-V model · GitHub) 模型是 RISC-V 指令语义的机器可读、可执行、数学精确的定义。关键在于,Sail 可以编译为 Coq——使得直接针对官方规范进行定理证明成为可能。
**CKB-VM 与标准 RISC-V 之间存在语义差距。** CKB-VM 不是通用的 RISC-V 实现。它拥有平坦的 4MB 内存模型(无 MMU/虚拟内存)、自定义 ECALL 处理(syscall 93 退出,非 M 模式陷阱)、三种 VERSION 模式(各有不同行为语义)、逐指令周期计数以及标准规范中没有对应物的 MOP 融合指令。这些差距必须被形式化地记录,并精确表征其对等价性的影响。
**实际影响:** 经过形式化验证的 CKB-VM 将为 CKB 的执行层提供最强的正确性保证。它将消除一整类潜在漏洞——指令级语义缺陷——这是任何数量的测试都无法完全防止的。对于一条安全模型完全依赖确定性执行的区块链而言,这是一项基础性投资。
### 四、解决方案
#### 4.1 核心思路
ckb-vm-sail-verify 采用双重验证策略:****Coq 形式化证明**提供数学确定性,**差分测试****提供实际运行时验证。两种方法互补——形式化证明覆盖已证明指令的所有可能输入,差分测试则用具体测试向量验证完整指令集。
#### 4.2 为什么选择 Sail + Coq
**Sail 是官方 RISC-V 规范。** 与非正式的散文规范不同,Sail 是可执行的、无歧义的,由 RISC-V International 维护。以 Sail 作为基准意味着我们针对硬件厂商使用的同一规范进行验证。
**Sail 可编译为 Coq。** Sail 编译器的 `–coq` 后端从 Sail 源码生成 Coq 定义。这使我们在 Coq——CompCert(经验证的 C 编译器)和 seL4(经验证的操作系统内核)使用的同一定理证明器——中获得了机器可检查的 RISC-V 语义。
**Coq 证明是全称的。** 一旦 Coq 定理被证明,它对所有可能的输入成立——而不仅仅是测试向量。一条被证明的 `ADD` 指令对每一种源寄存器和机器状态的组合都是正确的,无条件地。
**可复现且可审计。** Coq 证明由机器检查。任何人都可以重新运行 `make coq` 并独立验证每一个定理。不需要信任证明者——只需信任 Coq 的内核,这是现存最受审视的软件之一。
#### 4.3 四层验证架构
**Layer 1:Sail RISC-V 形式规范(基准真相)**
官方 sail-riscv 模型,通过 `sail --coq` 编译为 Coq。生成约 30,000 行 Coq 代码,以单子状态转换器风格定义完整的 RV64 指令语义。这是权威参考——RISC-V **应该**做什么。
**Layer 2:CKB-VM Coq 模型(验证目标)**
手工编写的 CKB-VM Rust 解释器逻辑的 Coq 模型,捕获寄存器文件操作(32 × 64 位,x0 硬连线为零)、PC 推进、ALU 计算、内存访问(小端序、平坦 4MB)和分支/跳转语义。基于 ckb-vm `execute.rs` 中的 `handle_*` 函数。这是 CKB-VM **实际**做什么。
**Layer 3:等价性证明(数学桥梁)**
对每条目标指令,证明:`ckb_vm_execute(I, state) = sail_execute(I, state)`。每个证明分解为三个子定理:(a) **语义**——目标寄存器具有正确的计算值;(b) **PC 更新**——下一个 PC 是 PC+4(或分支/跳转的 PC+偏移量);(c) **寄存器隔离**——所有其他寄存器不变。证明技术:展开定义 → 用辅助引理改写 → 用 `lia`/`reflexivity` 求解。
**Layer 4:差分测试(运行时验证)**
在 CKB-VM(Rust)和 Sail C++ 模拟器上执行相同的 RISC-V ELF 二进制文件,然后逐步比较执行轨迹——PC 值、寄存器状态和退出码。通过覆盖完整指令集(包括尚未被形式化证明的指令)和捕获 Coq 模型中可能不可见的实现缺陷来补充形式化证明。
#### 4.4 CKB-VM 特定考量
| 方面 | CKB-VM 行为 | Sail RISC-V 行为 | 处理方式 |
|------|------------|-----------------|---------|
| x0 寄存器 | 写入后清零 | 静默丢弃写入 | 功能等价——在 Coq 中证明 |
| ECALL | 通过 A7 分发(syscall 93 = 退出) | 陷入 M 模式处理器 | 差分测试中过滤;排除在 Coq 范围外 |
| 内存模型 | 平坦 4MB,无 MMU,W^X | Sv39/48/57 完整 MMU | CKB-VM 简化模型;在 4MB 子集内证明 |
| FENCE | 空操作(单线程) | 完整栅栏语义 | 单核模型中无可观测差异 |
| 原子操作(A 扩展) | 单地址保留 | 保留集 | 有限并发;证明单核子集 |
| VERSION 模式 | 三种模式(0/1/2) | 无 | 仅针对 VERSION2 |
| 周期计数 | 逐指令成本追踪 | 无 | 过滤;不属于体系结构状态 |
| MOP 融合 | WIDE_MUL、FAR_JUMP_REL、ADC | 无对应物 | 验证为标准指令的组合 |
### 五、详细技术实现计划
#### 5.1 Phase 0:环境搭建与指令映射(Week 1)
安装完整工具链:Sail(>= 0.20.x,通过 OPAM)、Coq(9.0.0 + coq-sail-stdpp)、CMake、Rust(1.92.0+)、GMP、RISC-V 交叉编译器(`gcc-riscv64-unknown-elf`)。克隆 sail-riscv 作为 git 子模块。构建 Rust 工作空间(`lib/` + `crates/diff-test/`)。通过 `sail --coq --dcoq-undef-axioms` 从 Sail 生成 Coq。构建 Sail C++ 模拟器用于差分测试。
创建完整的 CKB-VM 指令映射表:将所有 158 条 CKB-VM 操作码映射到其 Sail RISC-V 对应项,按扩展分类(I/M/C/A/B/MOP),识别无 Sail 对应项的指令(MOP 融合操作)。
#### 5.2 Phase 1:差分测试框架(Week 2)
实现 diff-test CLI 工具,支持逐步执行轨迹比较:
**CKB-VM 端(`lib/src/runner.rs`):** 将 CKB-VM 包装为逐步执行器,在每条指令后捕获 `StepState`(PC + 32 个寄存器)。处理退出系统调用(通过 A7 寄存器的 syscall 93)。支持 `–max-steps` 防止无限循环。
**Sail 端(`crates/diff-test/src/sail_runner.rs`):** 解析 Sail C++ 模拟器的轨迹输出格式。逐步提取 PC 和寄存器值。处理 ECALL 差异(riscv-tests 使用 `tohost` 内存映射 I/O;CKB-VM 使用系统调用)。
**比较引擎(`lib/src/state.rs`):** 带 `first_mismatch` 检测的 `CompareResult`——报告轨迹分歧的精确步骤、PC 和寄存器。支持 `–verbose` 模式(完整轨迹转储)和 `–json`(机器可读输出)。
**测试语料库:** 自动发现来自 riscv-tests(rv64ui、rv64um、rv64uc、rv64ua)、riscv-arch-test 和 CKB-VM 自有测试套件的 ELF 测试工件。
#### 5.3 Phase 2:Coq 基础设施与 Sail 接口(Week 3)
这是最具挑战性的阶段——桥接 Sail 的单子 Coq 输出与 CKB-VM 的纯模型。
**研究生成的 Coq(约 30,000 行):** 理解 Sail 的状态单子编码、寄存器访问模式、位向量算术库和指令解码/执行结构。识别每条 RISC-V 指令对应的 Coq 函数(如 `execute_RISCV_ADD`)。
**定义状态等价关系:** 形式化 CKB-VM 的 `machine_state` 记录与 Sail 单子状态之间的映射。处理类型不匹配(Sail 使用位向量;CKB-VM 模型使用 Z 整数)。
**构建证明基础设施:** 创建可重用的 Coq 策略(`solve_truncate`、`simplify_regs`、`unfold_sail_step`),自动化常见证明模式。用最小导入测试:`Check execute_RISCV_ADD`。
#### 5.4 Phase 3:核心 ALU 指令证明(Week 4)
证明 7 条核心指令,每条提供完整的三定理覆盖(语义 + PC + 寄存器隔离):
| 指令 | 类型 | 关键证明挑战 |
|------|------|------------|
| ADD | R 型 | `truncate_64` 幂等性,x0 硬连线 |
| SUB | R 型 | 无符号减法 mod 2^64 |
| ADDI | I 型 | 12 位立即数的符号扩展 |
| SLLI | 移位 | 移位量掩码(RV64 取低 6 位) |
| SRLI | 移位 | 逻辑右移与算术右移的区分 |
| SRAI | 移位 | 算术移位中符号位的保持 |
| MUL | M 扩展 | 64 位乘法,128 位乘积的低 64 位 |
构建可重用引理库:`truncate_64_idempotent`、`x0_always_zero`、`get_set_reg_same`、`get_set_reg_diff`、`sign_extend_properties`。
#### 5.5 Phase 4:控制流与内存证明(Week 5)
**分支指令:** BEQ 的跳转路径(PC += offset)和不跳转路径(PC += 4)。处理符号扩展的分支偏移量。
**跳转指令:** JAL 的链接地址(rd = PC + 4)和跳转目标(PC = PC + offset)。JALR 的间接跳转(LSB 清零)。
**内存指令:** 构建内存引理——存取往返(`load(store(addr, val)) = val`)、字节序(小端)、对齐。证明 LW(符号扩展的 32 位加载)和 SW(32 位存储)。
#### 5.6 Phase 5:扩展证明与 MOP 验证(Week 6)
**额外 ALU 证明:** AND、OR、XOR、SLTI、SLTIU、LUI、AUIPC。
**MOP 融合验证:** 验证 `WIDE_MUL(rd1, rd2, rs1, rs2)` = 顺序执行 `MULH(rd1, rs1, rs2); MUL(rd2, rs1, rs2)`。验证 `FAR_JUMP_REL(rd, offset)` = `AUIPC(rd, upper); JALR(rd, rd, lower)`。
**证明审计:** 审查所有 `Admitted` 引理。尽可能关闭。记录剩余 admits 及明确理由。
**目标:** 10+ 条核心指令完全证明(零 Admitted),5+ 条额外指令具有部分证明。
#### 5.7 Phase 6:完整差分测试覆盖与语义差距分析(Week 7)
**分扩展测试:** 在 rv64ui(整数)、rv64um(乘除法)、rv64uc(压缩指令)、rv64ua(原子操作)测试套件上运行差分测试。添加 `–extension` CLI 过滤器。
**边缘情况测试:** 除以零行为、最大移位量、整数溢出/下溢、非对齐内存访问、x0 写入尝试。
**语义差距文档化:** 生成正式的差距分析文档,覆盖所有 8 个已识别的语义差距,精确表征每个差距对等价性声明的影响。
**指令覆盖矩阵:** 更新 158 条操作码映射表,标注证明状态(已证明 / 部分证明 / 仅差分测试 / 未覆盖)和测试状态(通过 / 失败 / 跳过 / 不适用)。
#### 5.8 Phase 7:文档、交付物与路线图(Week 8)
**洁净室构建测试:** 从头克隆,`make all` 成功,所有 Coq 证明编译通过,差分测试通过。
**结项报告:** 已证明定理数量、完全覆盖的指令数量、通过的差分测试数量、完整差距分析、剩余 `Admitted` 证明及理由。
**Phase 2+ 路线图:** 完整 RV64I 覆盖(约 50 条指令)、M/C/B 扩展证明、ASM 模式验证(需要 Islaris 级工具,约 6+ 个月,$20k+ 规模)。
**代码质量:** `cargo fmt`、`cargo clippy`、移除临时文件、确保 CI 就绪。
### 六、预期交付成果
#### 6.1 核心交付物
1. **Coq 证明库(coq/)** —— 10+ 条核心 RISC-V 指令的形式化等价性定理,覆盖 RV64I ALU 操作、移位、分支、跳转及至少一条 M 扩展指令。每条指令包含三个子定理(语义、PC 更新、寄存器隔离)。所有证明由 Coq 机器检查——零信任需求。
2. **差分测试框架(crates/diff-test/)** —— 在 CKB-VM 和 Sail C++ 模拟器上执行相同 ELF 二进制文件并逐步比较执行轨迹的 CLI 工具。支持 `–verbose`、`–json`、`–max-steps`、`–test-dir` 参数。报告首个分歧点及完整状态转储。
3. **CKB-VM 指令映射文档** —— 所有 158 条 CKB-VM 操作码到 Sail RISC-V 对应项的完整映射,按扩展分类,标注证明/测试状态。
4. **形式化语义差距分析** —— 严格记录 CKB-VM 与标准 RISC-V 之间所有 8 个已识别语义差距,精确表征对等价性声明的影响及缓解策略。
5. **方法论文档(doc/)** —— 架构指南和证明方法论文档,使未来贡献者能够按照既定模式为额外指令添加证明,无需重新学习整个框架。
6. **源码可复现构建** —— 单条 `make all` 命令构建全部内容(Coq 证明 + Sail 模拟器 + Rust 差分测试工具)。文档化的工具链要求和安装脚本。
7. **Phase 2+ 路线图** —— 扩展至完整 RV64I、M/C/B 扩展和 ASM 模式验证的详细计划,适用于 Community Fund DAO 规模的提案。
#### 6.2 验收标准
##### 形式化验证标准
- V-1:`make coq` 在洁净环境中编译所有 Coq 证明,无错误或警告。
- V-2:至少 10 条 RISC-V 指令具有完整的三定理证明(语义 + PC + 隔离),零 `Admitted`。
- V-3:至少一条 M 扩展指令(MUL)被形式化证明。
- V-4:所有辅助引理(`truncate_64_idempotent`、`x0_always_zero`、`get_set_reg_*`)完全证明。
- V-5:状态等价关系被形式化定义并在所有证明中一致使用。
##### 差分测试标准
- T-1:差分测试在完整 rv64ui 测试套件(55 个测试)上通过。
- T-2:差分测试在 rv64um 测试套件(13 个测试)上通过。
- T-3:diff-test 工具正确检测并报告注入的语义偏差(负面测试)。
- T-4:`–json` 输出为 `jq` 可解析的有效 JSON。
- T-5:边缘情况测试(除以零、最大移位、溢出)产生匹配的轨迹。
##### 文档标准
- D-1:完整的 158 条操作码映射表,标注证明/测试状态。
- D-2:语义差距分析覆盖所有 8 个已识别差距及影响评估。
- D-3:洁净室构建测试:全新克隆 → `make all` 成功 → `make test` 通过。
- D-4:中英双语 README,含快速入门说明。
### 七、所需资金及用途说明
**申请总额:** 1,000 USD
**支付方式:** 100% CKB
| 类别 | 金额 | 说明 |
|------|------|------|
| 云服务器 | $350 USD | 1 台 VPS(Linux,≥ 4 核 16GB 内存),用于 Coq 编译(内存密集型)、Sail 模拟器构建和差分测试执行。8 周使用。 |
| 开发者补贴 | $450 USD | 核心开发工作。预计每周 20–30 小时,共 8 周。涵盖 Coq 证明开发、Rust 差分测试框架和 Sail 集成。 |
| 文档与社区 | $200 USD | 中英双语文档编写、架构图制作、2 次月度分享会材料、结项报告和 Phase 2+ 路线图准备。 |
### 八、预计完成时间
**总周期:** 8 周(约 2 个月)
#### 第一阶段:基础设施与可行性验证(Week 1–3)
**Week 1:** 完成工具链安装(Sail + Coq + CMake + Rust + RISC-V 交叉编译器)。克隆 sail-riscv 子模块。从 Sail 生成 Coq。构建 Sail C++ 模拟器。构建 Rust 工作空间。创建 158 条操作码指令映射表。
**Week 2:** 实现差分测试框架——CKB-VM 逐步执行器、Sail 轨迹解析器、比较引擎、支持 `–verbose`/`–json`/`–max-steps` 的 CLI 工具。处理 riscv-tests 与 CKB-VM 之间的 ECALL 差异。在 rv64ui 套件上运行初始差分测试。
**Week 3:** 研究 Sail 生成的 Coq(约 30,000 行)。定义状态等价关系。构建证明基础设施和可重用策略。创建最小测试导入(`Check execute_RISCV_ADD`)。记录 Sail Coq 桥接策略。**最具挑战性的一周**——调试导入、名称冲突、库依赖。
**里程碑 1(Week 3 末):** 差分测试框架可运行,rv64ui 测试通过。Coq 基础设施可针对 Sail 生成的定义进行编译。指令映射表完成。
#### 第二阶段:形式化证明(Week 4–6)
**Week 4:** 证明 7 条核心 ALU 指令(ADD、SUB、ADDI、SLLI、SRLI、SRAI、MUL),每条提供完整的三定理覆盖。构建可重用引理库。
**Week 5:** 证明控制流指令(BEQ 跳转/不跳转、JAL 链接+跳转)。构建内存引理。证明 LW、SW。提交中期进度报告。
**Week 6:** 证明额外 ALU 指令(AND、OR、XOR、SLTI、JALR)。验证 MOP 融合等价性(WIDE_MUL = MULH + MUL)。审计所有 `Admitted` 证明。
**里程碑 2(Week 6 末):** 10+ 条指令具有完整证明。MOP 融合已验证。中期报告已提交。
#### 第三阶段:集成与交付(Week 7–8)
**Week 7:** 在 rv64ui/rv64um/rv64uc/rv64ua 上进行完整差分测试覆盖。边缘情况测试。生成正式语义差距分析文档。更新指令覆盖矩阵。
**Week 8:** 洁净室构建测试。中英双语文档。代码清理(fmt、clippy)。结项报告。Phase 2+ 路线图。社区分享会。最终提交。
**里程碑 3(Week 8 末):** 所有交付物提交——Coq 证明库、差分测试框架、指令映射、差距分析、方法论文档和路线图。
#### 时间线总览
| 阶段 | 周次 | 重点 | 里程碑 |
|------|------|------|--------|
| 第一阶段:基础设施 | Week 1–3 | 工具链、差分测试框架、Coq 基础设施、指令映射 | 里程碑 1 |
| 第二阶段:形式化证明 | Week 4–6 | 核心 ALU 证明、控制流、内存、MOP、证明审计 | 里程碑 2(Week 6) |
| 第三阶段:集成与交付 | Week 7–8 | 完整测试覆盖、差距分析、文档、洁净室构建、提交 | 里程碑 3 |
### 九、与 CKB 生态的关联性
**回应长期开放的社区需求。** [ckb-vm#190](Formally Verify CKB-VM via Sail · Issue #190 · nervosnetwork/ckb-vm · GitHub) 自 2021 年开放至今,请求对 CKB-VM 的 RISC-V 实现进行形式化验证。本项目直接交付概念验证解决方案。
**加固 CKB 的安全基石。** CKB-VM 执行 Nervos 网络上的每一笔交易。经过形式化验证的指令集消除了一整类潜在漏洞——测试无法完全防止的指令级语义缺陷。这对于安全模型依赖确定性执行的区块链而言至关重要。
**建立可复用的验证框架。** Coq 证明基础设施、差分测试工具和方法论文档均为扩展而设计。未来贡献者可按照既定模式为额外指令添加证明,无需重新学习框架。
**展示 CKB 的技术纵深。** 区块链虚拟机的形式化验证在行业中罕见。CompCert 验证了 C 编译器;seL4 验证了操作系统内核;本项目将同等严谨性带入 CKB 的执行层。这将 CKB 定位于技术最具雄心的区块链项目之列。
**为未来全 ISA 验证铺路。** 本 PoC 覆盖 10+ 条指令并建立方法论。Phase 2+ 路线图(完整 RV64I、扩展、ASM 模式)为 Community Fund DAO 提案规模的全面验证提供了清晰路径。
**纯 Rust + Coq 工具链。** 差分测试框架为纯 Rust,与 CKB 技术栈一致。CKB 社区的 Rust 开发者无需学习新语言即可贡献(Coq 证明是专业组件,但测试框架对所有人开放)。
### 十、技术风险与应对
| 风险 | 影响 | 概率 | 应对 |
|------|------|------|------|
| Sail Coq 输出结构在版本间变化 | 高 | 低 | 锁定 sail-riscv 到特定 commit。记录精确的 Sail 编译器版本。提供 `generate_coq.sh` 用于可复现的重新生成。 |
| Sail 单子 Coq 过于复杂,难以与 CKB-VM 纯模型桥接 | 高 | 中 | Week 3 专门用于此任务。退路:提取执行后的可观测状态而非证明步骤内部等价性。最坏情况:针对简化的 Sail 提取物而非原始单子输出进行证明。 |
| Coq 编译时间超出开发迭代速度 | 中 | 中 | 生成的 Coq 约 30K 行。使用 `_CoqProject` 进行选择性编译。增量开发证明。16GB 内存的云服务器支持并行 coqc。 |
| CKB-VM Rust 语义与 Coq 模型偏差 | 中 | 中 | 差分测试捕获运行时偏差。将 Coq 模型与实际 Rust 源码交叉参照。记录所有抽象差距。 |
| riscv-tests ECALL 约定与 CKB-VM 不兼容 | 中 | 低 | 已识别并已设计解决方案:CKB-VM 使用 syscall 93,riscv-tests 使用 `tohost`。Sail runner 过滤 ECALL 差异。 |
| 时间不足以完成 10+ 条指令证明 | 中 | 中 | 优先处理证明结构最简单的核心 ALU 指令(ADD、SUB、ADDI)。尽早构建可重用策略。接受复杂指令(SW、JALR)的部分证明(记录 Admits)。 |
| Sail C++ 模拟器在目标平台上构建失败 | 低 | 低 | CMake 构建在 sail-riscv 中有良好文档。提供带错误处理的 `build_sail_emulator.sh`。GMP 是唯一非平凡依赖。 |
| 生成的 Coq 需要不可用的 Coq 库 | 低 | 中 | 按 sail-riscv 项目文档使用 `coq-sail-stdpp`。在构建说明中锁定精确的 Coq 版本(9.0.0)。 |
### 十一、透明度承诺
**Day 1 起完全开源。** 所有代码在 GitHub 上以 MIT 许可证公开,从开发伊始即为公开仓库。
**每周进度更新。** 每周在 Nervos Talk 论坛发布,涵盖已完成任务、阻塞项和下周计划。
**月度分享会。** 共两次(Week 4 和 Week 8),最后一次包括 Coq 证明编译和差分测试执行的实时演示及问答环节。
**机器可验证的结果。** 所有 Coq 证明由机器检查——任何人都可以克隆仓库、运行 `make coq`,独立验证每一个定理。不需要信任作者。
**如实报告局限性。** 结项报告将明确记录:`Admitted` 证明的数量及原因、阻碍完整等价性声明的语义差距、以及尚未覆盖的指令。
**从头可复现。** 完整的构建说明、锁定的依赖版本和安装脚本,确保任何评审者都能独立复现所有结果。