Introducing Infern: serve an AI model from your own machine and get paid per request over Fiber

Hello,

I have been building Infern, a compute marketplace where an individual can serve an AI model from their own hardware and get paid per request over Fiber. It is open source, running on testnet today, and I would love feedback from this community before I take it further.

The short version: a provider points Infern at a model they are already running, sets a price, and starts earning. A consumer calls that model with a normal OpenAI style request and pays per request over a Fiber payment channel. CKB holds the shared, trusted state (who serves what, at what price, with what reputation and stake). Fiber moves the money. The inference itself never touches the chain.

GitHub: truthixify/infern

The problem

A lot of people already run models. A Mac with a decent GPU, a single rented GPU box, a fine tuned model for one language or one domain. What none of them have is a way to charge strangers per request. Web2 billing means an account, a company, a card processor, and a minimum scale the long tail will never reach. So all of that spare capacity and all of those niche models sit unused, or get given away for free.

This is a payment problem, not a compute problem. Inference does not need a blockchain. An individual selling inference to the world needs a way to collect many small payments from anyone, with no accounts and no chargebacks, and that is exactly what CKB plus Fiber is good at.

Infern is for the long tail: home operators, people who self deploy on rented hardware, and fine tuners with a niche model and no infrastructure. It is not trying to serve large labs, which already have billing and no reason to settle through a chain.

What Infern does

For a provider:

  • Point the agent at a model server you already run (vLLM, llama.cpp, Ollama, LM Studio, anything that speaks the OpenAI chat API).

  • Register a listing on CKB: the model id, your price, and your capabilities.

  • Keep a Fiber node reachable so you can receive payments.

  • Post a stake that can be slashed if you misbehave.

  • Start earning per request.
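To make the listing step concrete, here is a minimal sketch of what a listing might carry and how an agent could sanity check it before registering. Every field name, the `validateListing` helper, and the minimum stake parameter are assumptions for illustration, not Infern's actual cell layout.

```typescript
// Hypothetical listing shape. Amounts are in shannons (1 CKB = 10^8 shannons).
interface Listing {
  modelId: string;
  pricePerRequestShannons: number;
  capabilities: string[];
  fiberNodeAddress: string; // where consumers or routers can reach the provider's Fiber node
  stakeShannons: number;
}

// Returns a list of problems; an empty list means the listing looks publishable.
function validateListing(listing: Listing, minStakeShannons: number): string[] {
  const problems: string[] = [];
  if (listing.modelId.trim() === "") problems.push("modelId is empty");
  if (!Number.isInteger(listing.pricePerRequestShannons) || listing.pricePerRequestShannons < 0) {
    problems.push("price must be a non-negative integer");
  }
  if (listing.fiberNodeAddress.trim() === "") problems.push("fiberNodeAddress is empty");
  if (listing.stakeShannons < minStakeShannons) problems.push("stake is below the minimum");
  return problems;
}
```

Checking locally before touching the chain keeps obviously broken listings from costing a transaction.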

For a consumer:

  • Call a specific provider directly, or name a model class and let a router pick a live provider for you.

  • Pay per request. The wire format is the standard inference API, so existing clients and SDKs work with no Infern specific code.

  • Optionally use a capped free tier gated by identity, with no payment rail at all, as an onramp.

How payment works (F402)

The core idea is a scheme I call F402, which is HTTP 402 Payment Required wired to Fiber.

When a consumer sends a request without payment, the provider or router answers with a 402 that carries a Fiber invoice. The consumer pays that invoice over a payment channel, retries the request with proof of payment, and the provider serves the response. Because the payment moves inside a Fiber channel and not as an on chain transaction, each request settles in milliseconds and costs a fraction of a CKB, which is what makes per request pricing actually practical.
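The request, 402, pay, retry loop above can be sketched from the provider's side. This is not the Infern implementation: `FakeFiber` is an in-memory stand-in for a Fiber node, the invoice id doubles as the proof of payment, and all names are hypothetical.

```typescript
// Sketch only: a real agent would ask its Fiber node over JSON-RPC
// whether an invoice has been paid.
interface Invoice {
  id: string;
  amountShannons: number;
}

class FakeFiber {
  private paid = new Set<string>();
  private counter = 0;

  newInvoice(amountShannons: number): Invoice {
    this.counter += 1;
    return { id: `inv-${this.counter}`, amountShannons };
  }

  pay(invoiceId: string): void {
    this.paid.add(invoiceId);
  }

  isPaid(invoiceId: string): boolean {
    return this.paid.has(invoiceId);
  }
}

type F402Response =
  | { status: 402; invoice: Invoice }
  | { status: 200; body: string };

// Serve only when the request carries proof of a paid, not yet redeemed invoice.
// Otherwise answer 402 with a fresh invoice.
function handleRequest(
  prompt: string,
  proof: string | undefined,
  fiber: FakeFiber,
  priceShannons: number,
  redeemed: Set<string>,
): F402Response {
  if (proof !== undefined && fiber.isPaid(proof) && !redeemed.has(proof)) {
    redeemed.add(proof); // one payment buys exactly one response
    return { status: 200, body: `model output for: ${prompt}` };
  }
  return { status: 402, invoice: fiber.newInvoice(priceShannons) };
}
```

A consumer would call once without proof, pay the returned invoice, and call again with the invoice id. Tracking redeemed proofs is what stops one payment from being replayed for many responses.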

There is more than one settlement path. In rough order of how smooth each is for the consumer:

  • Prepaid balance. Deposit once into a balance cell or open one channel to a router. After that, each request just debits the balance and the server responds immediately, with no per request payment round trip.

  • Atomic multi hop. The consumer pays the provider through a router as a Fiber hop, with a hashed timelock so the provider is paid only on delivery and the router never custodies funds.

  • Free tier. Verified identity, a capped allowance, no payment at all.
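As an illustration of the prepaid path, here is a minimal sketch of the debit check a server could run before answering. The `PrepaidBalance` shape and `debitForRequest` are hypothetical names, and a real balance would live in a balance cell or a channel rather than in memory.

```typescript
// Hypothetical in-memory view of a consumer's prepaid balance, in shannons.
interface PrepaidBalance {
  remainingShannons: number;
}

type DebitDecision =
  | { serve: true; remainingShannons: number }
  | { serve: false; reason: "insufficient_balance" };

// Debit the price if the balance covers it; otherwise leave the balance untouched.
function debitForRequest(balance: PrepaidBalance, priceShannons: number): DebitDecision {
  if (balance.remainingShannons < priceShannons) {
    return { serve: false, reason: "insufficient_balance" };
  }
  balance.remainingShannons -= priceShannons;
  return { serve: true, remainingShannons: balance.remainingShannons };
}
```

When the debit is refused, the server could fall back to the 402 invoice flow instead of rejecting the request outright.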

A consumer does not open a channel to every provider. They connect to a router and reach the whole network through it, the same way Lightning style routing works.

Why CKB and Fiber

The split is clean:

  • CKB holds what must be shared and trusted: a model provenance registry, provider and listing cells with price and capabilities, stake cells with slashing, optional balance cells, and reputation that anyone can read or derive from on chain events. The cell model fits this well, since each listing and stake is a cell the owner controls.

  • Fiber moves the money: many tiny payments per second, off chain, final, with no per request gas.

  • Inference stays off chain on the provider’s own hardware, where it belongs.

CKB does the part it is uniquely good at, a small shared state of record, and Fiber does the part it is uniquely good at, high frequency micropayments. Neither tries to do the compute.

How it is built

It is a monorepo of TypeScript services plus Rust on chain scripts.

  • Provider agent. HTTP server that does F402 verification, proxies the model server, and registers on chain.

  • Router. Stateless relay that selects a live provider, relays the request, settles payment, and fails over.

  • Indexer. Scans the chain, serves a directory and reputation API, and streams live updates.

  • Free tier service. Identity challenge and response, quotas, treasury settlement.

  • Checks. Liveness, inference, and honesty probes, reputation scoring, slashing monitor.

  • Consumer SDK. A thin wrapper over a standard inference client.

  • Contracts. Rust ckb-std type scripts: registry, provider, listing, stake, balance, treasury.

Off chain is TypeScript in strict mode. On chain is Rust with ckb-std. CKB access goes through CCC, Fiber access through a typed JSON-RPC client. There are contract tests against a synthetic harness and end to end tests against a local offckb devnet.
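The router's "select a live provider, relay, fail over" behaviour can be sketched as below. The ranking policy (reputation first, then price) and every name here are assumptions for illustration, not the router's actual logic.

```typescript
// Hypothetical directory entry, as an indexer might serve it.
interface ProviderEntry {
  id: string;
  modelClass: string;
  priceShannons: number;
  live: boolean;
  reputation: number; // higher is better
}

// Keep live providers for the requested model class, best reputation first,
// cheapest first among equals. filter() copies, so the input is not reordered.
function rankProviders(providers: ProviderEntry[], modelClass: string): ProviderEntry[] {
  return providers
    .filter((p) => p.live && p.modelClass === modelClass)
    .sort((a, b) => b.reputation - a.reputation || a.priceShannons - b.priceShannons);
}

// Try candidates in order; an attempt returning undefined means "failed, try the next one".
function routeWithFailover<T>(
  candidates: ProviderEntry[],
  attempt: (provider: ProviderEntry) => T | undefined,
): { provider: ProviderEntry; result: T } | undefined {
  for (const provider of candidates) {
    const result = attempt(provider);
    if (result !== undefined) return { provider, result };
  }
  return undefined;
}
```

Keeping the relay behind an injected `attempt` function means the selection and failover logic can be tested without a network.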

Keeping providers honest

Because providers are anonymous individuals, the network cannot just trust them:

  • Liveness checks confirm a provider is reachable.

  • Inference checks send a real prompt and confirm a sane response comes back in reasonable time.

  • Honesty checks probe against the weights hash registered on chain, so a provider that quietly swaps in a cheaper model than it listed can be caught.

  • Repeated failures slash the provider’s stake and drop its reputation, which removes it from routing.

Probing is done by routers and indexers as a side effect of their work, not by one central authority, and the results feed reputation anyone can verify.
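One way probe results could feed reputation and slashing is sketched below. The threshold, the score deltas, and the 0 to 100 reputation scale are all made up for illustration; the spec may define these differently.

```typescript
type ProbeKind = "liveness" | "inference" | "honesty";

interface ProbeResult {
  kind: ProbeKind;
  ok: boolean;
}

interface ProviderHealth {
  consecutiveFailures: number;
  reputation: number; // hypothetical scale, 0 to 100
  slashed: boolean;
}

const FAILURE_THRESHOLD = 3; // hypothetical: consecutive failures before slashing

// Fold one probe result into a provider's health record without mutating the input.
function applyProbe(health: ProviderHealth, probe: ProbeResult): ProviderHealth {
  if (probe.ok) {
    return { ...health, consecutiveFailures: 0, reputation: Math.min(100, health.reputation + 5) };
  }
  const consecutiveFailures = health.consecutiveFailures + 1;
  return {
    consecutiveFailures,
    reputation: Math.max(0, health.reputation - 20),
    slashed: health.slashed || consecutiveFailures >= FAILURE_THRESHOLD,
  };
}

// A router would skip providers that are slashed or have no reputation left.
function isRoutable(health: ProviderHealth): boolean {
  return !health.slashed && health.reputation > 0;
}
```

A single failed probe only dents reputation here; slashing needs repeated failures, which matches the intent of not punishing a brief outage.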

Current status

  • Running on public testnet.

  • A working chat page where you load a local model, register it on chain, open a Fiber channel, and pay per request.

  • Provider and consumer quickstarts.

  • Register CLIs for publishing a provider, a model, and a listing.

  • Contract tests and end to end tests against offckb.

It is early and rough in places. The spec is a draft and I expect parts of it to change based on feedback.

What I would love feedback on

  • The economic model. Is per request pricing over Fiber the right shape for the long tail, or should prepaid balance be the default from day one?

  • The honesty checks. Hashing weights is a blunt instrument. What would you probe to prove a provider serves what it claims?

  • The trust split between CKB and Fiber. Anything you would move on chain or off chain.

  • Routing and liquidity. The hub and spoke channel model needs well capitalized routers. Is that a fair thing to build on?

  • The cell design for the registry, listing, and stake scripts.

Where this could go

If serving a model becomes as easy as running one command and getting paid for it, the long tail of compute and the long tail of fine tuned models get a payment rail they have never had. That is a genuine use of CKB and Fiber together: small shared state plus high frequency settlement, in service of something people already want to do.

The code is open. Please be direct and critical. I would rather hear what is wrong now than after I have built more on top of it.

GitHub: truthixify/infern

Thanks for reading.
