A privacy-preserving, decentralised machine learning system for identifying non-human wallet activity and suspicious wallets at scale
Spark Grant Request: $1,500
Duration: 1 month + 2 week buffer (6 weeks total)
1. Overview
We are building a collaborative, federated machine learning system that monitors and classifies on-chain wallet behaviour on the Nervos CKB blockchain. Its core purpose is to distinguish wallets operated by genuine human users from those driven by automated bots, DEX trading agents, or exchange-related non-human activity. A later goal is to label wallets used in fraudulent activity. Throughout, no organisation needs to expose its raw user data to a central party.
Current status: We have already started research and product development, and a small working prototype exists. This Spark grant will accelerate the work: enhancing the model, onboarding partners, building reliable dashboards, collecting labeled data, and releasing a production-ready system that any CKB user, wallet provider, or team can access.
Timeline: 1 month of focused work with a 2-week buffer for testing and iteration (6 weeks total).
2. Current Status & What We Have Built
Completed work
- Research & architecture design: Federated learning architecture for wallet classification
- Flower federated server: FedProx aggregation, model distribution
- Flower client implementation: Local training, weight update submission
- FedProx strategy integration: Proximal term for heterogeneous data
- Local training pipeline: PyTorch-based classifier training
- Model inference: Query endpoint for wallet classification
- Initial feature research: Temporal and graph feature definitions, datasets
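To illustrate the FedProx entry above, here is a minimal pure-Python sketch of the proximal term. In FedProx, each client minimises its local loss plus (mu / 2) * ||w - w_global||^2, which discourages local weights from drifting far from the global model when client data is heterogeneous. The function names and the default mu value are illustrative assumptions; the prototype itself uses Flower and PyTorch, and its code is not reproduced here.

```python
from typing import Sequence


def proximal_penalty(
    local_w: Sequence[float], global_w: Sequence[float], mu: float
) -> float:
    """Return (mu / 2) * ||local_w - global_w||^2, the FedProx proximal term."""
    if len(local_w) != len(global_w):
        raise ValueError("weight vectors must have the same length")
    squared_distance = sum((l - g) ** 2 for l, g in zip(local_w, global_w))
    return 0.5 * mu * squared_distance


def fedprox_objective(
    local_loss: float,
    local_w: Sequence[float],
    global_w: Sequence[float],
    mu: float = 0.01,  # assumed value, chosen only for illustration
) -> float:
    """Local loss plus the proximal penalty that limits client drift."""
    return local_loss + proximal_penalty(local_w, global_w, mu)
```

With mu set to 0 the objective reduces to the plain local loss, which is why mu is usually treated as a tunable hyperparameter.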
Milestone + Budget
Week 1 - $200 | CKB-CCC + Explorer API integration
Connect to CKB mainnet/testnet RPC via CKB-CCC, fetch transaction history for any lock hash, pull enriched data from Explorer API (timestamps, counterparties, Cell consumption patterns)
Week 2 - $250 | Feature extraction pipeline + continued data curation
Extract temporal features (inter-transaction intervals, burst detection, regularity scores) and graph features (centrality, clustering, flow ratios). Continue curating toward a goal of 500 human + 500 non-human labeled wallets; we have already trained the model with 100 non-human wallets.
Week 3 - $300 | Complete labeled dataset + global model training
Finalize the 500 human + 500 non-human wallet dataset. Train the FedProx global model on CKB data, validate accuracy (>75% target), and produce ~100 MB of model weights.
Week 4 - $350 | Production web UI + finetuner dashboard + partner integrations
Launch complete web UI with wallet classification, whitelisted finetuner dashboard for training submissions, Mosaic Africa integration, API endpoint for any wallet provider (JoyID, Neuron, etc.)
Week 5 - $250 | Testing + iteration + onboarding
Cross-browser QA, mobile responsiveness, accuracy improvements, bug fixes, user feedback collection, onboard 2+ additional projects
Week 6 - $150 | Public release + documentation + completion
Open source launch (MIT license), API documentation, live demo page, demo video, completion report
3. Problem Statement
Blockchain networks are inherently transparent, yet identifying whether a wallet is controlled by a real human trader, an automated trading bot, or an exchange wallet remains a difficult challenge.
Organisations building on CKB currently lack a shared, privacy-preserving mechanism to detect such behaviour. A DEX cannot easily tell if a trader is human or bot. An airdrop platform cannot easily filter Sybil wallets. A wallet provider cannot easily warn users about suspicious counterparties.
Training a centralised detection model would require pooling sensitive user data across organisations, which is a privacy and compliance risk. At the same time, any single organisation’s data is unlikely to capture the full breadth of abnormal behaviour seen across the entire network.
4. Proposed Solution
We introduce a federated learning framework, built on Flower (flwr), in which each participating organisation contributes to improving the shared model without exposing its raw user data.
Key innovation for user experience: Participating organisations do NOT need to download the model or run complex infrastructure. Everything is abstracted behind a simple web UI.
5. How it works for participating organisations (Partner organisations finetuning)
The organisation only needs to:
- Get whitelisted as a finetuner
- Log into the web UI
- Submit wallet addresses with labels (human/non-human)
- Click “Start Training”
Everything else is handled by an internal workflow and abstracted away.
6. How it works for inference-only users (no training)
No model download required. No infrastructure setup. Just use the UI or API.
7. Example integration: A random wallet
8. Inference output: Binary classification with confidence
The model generates a binary output along with a confidence score. An output of 1, for example with 60% confidence, indicates that the behaviour is consistent with genuine human trading. Conversely, an output of 0, for example with 30% confidence, signifies that the behaviour matches patterns typical of bots, DEX agents, or exchanges.
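As a rough sketch of how an integrator might consume this output, the snippet below maps a (label, confidence) pair to a human-readable verdict. The `Classification` type, the `interpret` function, the `min_confidence` threshold, and the "uncertain" bucket are all hypothetical and not part of the described API; they only show one way to treat low-confidence results cautiously.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    label: int         # 1 = human-like, 0 = non-human (bot, DEX agent, exchange)
    confidence: float  # model confidence in [0.0, 1.0]


def interpret(result: Classification, min_confidence: float = 0.5) -> str:
    """Turn a binary label plus confidence into a verdict string.

    min_confidence is an assumed policy threshold, not a value from the model.
    """
    if result.label not in (0, 1):
        raise ValueError("label must be 0 or 1")
    if not 0.0 <= result.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if result.confidence < min_confidence:
        return "uncertain"
    return "human" if result.label == 1 else "non-human"


print(interpret(Classification(label=1, confidence=0.60)))
print(interpret(Classification(label=0, confidence=0.30)))
```

Under this assumed policy, the first example from the text (1 at 60%) is reported as human, while the second (0 at 30%) falls below the threshold and is flagged as uncertain rather than non-human.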
9. Continuous improvement through federated learning
Once the global model is deployed, whitelisted finetuners can:
- Submit additional labeled wallet addresses
- Trigger local training rounds
- Contribute weight updates to improve the global model
All organisations and wallet providers benefit from improvements, even if they never train.
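The server-side step that merges these contributions can be sketched as an example-weighted average of client weights, the aggregation rule FedProx shares with FedAvg. This is a simplified pure-Python illustration over flat weight lists; in the actual system Flower performs aggregation, and the `aggregate` function name is an assumption made for this sketch.

```python
from typing import List, Sequence, Tuple


def aggregate(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Average client weights, weighted by each client's number of examples.

    updates: one (weights, num_examples) pair per participating finetuner.
    """
    if not updates:
        raise ValueError("at least one client update is required")
    total_examples = sum(n for _, n in updates)
    if total_examples <= 0:
        raise ValueError("total number of examples must be positive")

    dim = len(updates[0][0])
    global_weights = [0.0] * dim
    for weights, num_examples in updates:
        if len(weights) != dim:
            raise ValueError("all clients must submit weights of the same shape")
        share = num_examples / total_examples
        for i, w in enumerate(weights):
            global_weights[i] += w * share
    return global_weights


# Two finetuners: one trained on 100 labeled wallets, the other on 300.
print(aggregate([([1.0, 0.0], 100), ([3.0, 4.0], 300)]))
```

Only the weight vectors and example counts reach the server in this sketch; the labeled wallet data behind them stays with each organisation.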
10. Input Features
The model learns from two primary categories of on-chain signals derived from CKB transaction records.
Transaction frequency and timing
Human patterns are characterized by irregular, variable inter-transaction intervals, occasional and context-driven burst patterns, time-of-day distribution that aligns with human hours (which varies by region), and a random transaction cadence. In contrast, non-human patterns feature regular, deterministic inter-transaction intervals, frequent and sustained burst patterns, 24/7 activity regardless of time of day, and a periodic transaction cadence, such as every 60 seconds.
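A minimal sketch of how such timing signals could be computed from a wallet's transaction timestamps follows. The feature names and the choice of coefficient of variation as a regularity score are assumptions for illustration, not the project's actual feature definitions. Hours are bucketed in UTC, so any regional adjustment would need to be added separately.

```python
from statistics import mean, pstdev
from typing import Dict, List


def temporal_features(timestamps: List[int]) -> Dict[str, float]:
    """Compute simple timing features from Unix timestamps (in seconds)."""
    ts = sorted(timestamps)
    intervals = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three transactions")

    avg_interval = float(mean(intervals))
    # Coefficient of variation: near 0 for clockwork-like cadence,
    # larger for irregular, human-like spacing.
    interval_cv = pstdev(intervals) / avg_interval if avg_interval > 0 else 0.0
    # Fraction of the 24 UTC hours in which the wallet was ever active.
    active_hours = {(t // 3600) % 24 for t in ts}
    return {
        "mean_interval_s": avg_interval,
        "interval_cv": interval_cv,
        "active_hour_fraction": len(active_hours) / 24,
    }


periodic = temporal_features([0, 60, 120, 180, 240])      # every 60 seconds
irregular = temporal_features([0, 50, 4000, 4100, 90000])
print(periodic["interval_cv"], irregular["interval_cv"])
```

The periodic wallet yields a coefficient of variation of zero, while the irregular one scores well above it, matching the contrast described above.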
Wallet graph relationships and address patterns
Human patterns show a diverse and irregular interaction graph, with mixed send and receive flow directionality, and low graph centrality. Non-human patterns, on the other hand, are characterized by star topologies, chains, or tight clusters in the interaction graph, primarily one-directional flow, and high graph centrality, meaning a wallet acts as a hub for many other wallets.
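The graph signals above can be approximated with two simple measures, sketched here in pure Python: normalised degree centrality (the share of other wallets in the graph that a wallet touches) and an outflow ratio (the share of its transfers that are sends). The function name, feature names, and the (sender, receiver) edge representation are assumptions for illustration only.

```python
from typing import Dict, Iterable, Tuple


def graph_features(
    wallet: str, transfers: Iterable[Tuple[str, str]]
) -> Dict[str, float]:
    """Compute centrality and flow direction for one wallet.

    transfers: (sender, receiver) pairs describing the local interaction graph.
    """
    edges = list(transfers)
    nodes = {address for edge in edges for address in edge}
    sent = [dst for src, dst in edges if src == wallet]
    received = [src for src, dst in edges if dst == wallet]
    total = len(sent) + len(received)
    if total == 0:
        raise ValueError("wallet does not appear in any transfer")

    counterparties = (set(sent) | set(received)) - {wallet}
    other_nodes = len(nodes) - 1
    return {
        # 1.0 means the wallet interacts with every other wallet in the graph.
        "degree_centrality": len(counterparties) / other_nodes if other_nodes else 0.0,
        # 1.0 = only sends, 0.0 = only receives, about 0.5 = mixed flow.
        "outflow_ratio": len(sent) / total,
    }


star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
print(graph_features("hub", star))
print(graph_features("a", star))
```

In the star topology, the hub reaches full centrality with purely one-directional flow, which is the kind of pattern the text associates with non-human wallets.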
11. Technical Implementation
Data extraction on CKB
We use two primary data sources for feature extraction:
- CKB-CCC: historical data endpoint
- CKB Explorer API: transactions and wallet endpoints
Model size and infrastructure
The global model size is approximately 100 MB. Inference does not require any download, as the user interface or API abstracts the model away. Training is performed on backend server hardware, not on user devices. As a result, user requirements are minimal: only a web browser is needed.
12. User Interface & Experience
Training UI (for whitelisted finetuners only)
13. Justification for $1,500:
- Two specialized developers (Backend Developer + ML Engineer)
- Existing work already started (research + prototype)
- Dataset curation effort (500 human + 500 non-human wallets)
- API development for any wallet provider to integrate
- Post-grant commitment to onboarding 3+ projects
- 1 month rapid execution + 2 week buffer for quality
14. Team
We are two developers with complementary expertise in blockchain infrastructure, applied machine learning, and production systems. We’ve already built a working prototype, validated the approach, and are now moving toward a full production release.
Fadhil Mulinya - Backend & Blockchain Developer
- Experienced with CKB as a Nervos Catalyst program participant, including CKB-CCC, RPC interfaces, and transaction data modelling.
- Built the current node data extraction pipeline and FL client-server integration.
- Focuses on reliable, production-grade infrastructure that works for real partners.
Paul Wako - Machine Learning Engineer
- Specialises in privacy-preserving ML, federated learning (Flower / PyTorch), and feature engineering from sequential data.
- Designed the current model architecture, FedProx training loop, and feature extraction logic.
- Ensures the model is lightweight, explainable, and practical for real-world wallet classification.
Project GitHub



