Zk Coprocessors can be divided into two types: data access zk Coprocessors, and computation zk Coprocessors [1]. Both use succinct proofs, but each is intended to solve a different type of problem. For data access zk Coprocessors, the problem is trust, whereas for computation zk Coprocessors, the problem is scalability.
The Problem
Trust
Data access zk Coprocessors are a relatively new technology that allows developers to trustlessly access historical blockchain data using zero-knowledge proofs. It may seem counterintuitive that we need new cutting-edge technology to look up data from previous blocks. After all, the concept of the blockchain is that it’s a public record whose entries are universally agreed upon, become more certain with each passing block, and can never be erased or changed after the fact. Also, from a practical standpoint, most data is historical data: the current block becomes a historical block very quickly (Ethereum’s block time is about 12 seconds), and with no succeeding block confirmations the information in the current block is still suspect.
And indeed, if you trust the archive node from which you obtain blockchain data, there is no need to introduce a data access zk Coprocessor. The problem arises if you don’t trust the data provider. Even if you do, part of the project of web3 is to eliminate trust assumptions, so it is worth the effort to achieve the same results trustlessly. Without some kind of succinctly verifiable computation, colloquially known as “zk” (short for zero-knowledge), eliminating this trust assumption is quite difficult.
In a September 2023 interview, Axiom cofounder Yi Sun explained the difficulty with trustlessly obtaining historical on-chain data [2]. To illustrate the problems developers face interacting with historical data, he used the hypothetical example of obtaining the price of Ether from the Chainlink smart contract one year prior, for use in a current smart contract.
“You basically have two options as a developer. One option, which is in principle trustless, is that every block of Ethereum commits to the entire history of Ethereum. That’s the definition of a blockchain. And so, if you had infinite computational power, what you could do is exhibit to your smart contract… every one of the past 1 million block headers, check that the hash of each block header is in the next block header (so that would establish that a block header one million blocks ago is what you say it is) and then check that the commitment to the contract storage variable of Chainlink a million blocks ago is what you say it is via a Merkle proof.”
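The header-chain check Yi describes can be sketched in a few lines. This is only an illustration under simplifying assumptions: the `Header` class, `build_chain`, and `verify_ancestor` are hypothetical names, the header has three fields instead of Ethereum's full RLP-encoded header, and SHA-256 stands in for Keccak-256 so that the sketch needs only the standard library.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    """Toy block header (real Ethereum headers are RLP-encoded and Keccak-256 hashed)."""
    number: int
    parent_hash: bytes
    state_root: bytes

    def hash(self) -> bytes:
        # SHA-256 is a stand-in for Keccak-256 to keep the sketch stdlib-only.
        payload = self.number.to_bytes(8, "big") + self.parent_hash + self.state_root
        return hashlib.sha256(payload).digest()


def build_chain(length: int) -> list[Header]:
    """Build a hypothetical chain whose state roots are hashes of the block number."""
    chain: list[Header] = []
    parent = b"\x00" * 32
    for n in range(length):
        header = Header(n, parent, hashlib.sha256(f"state-{n}".encode()).digest())
        chain.append(header)
        parent = header.hash()
    return chain


def verify_ancestor(headers: list[Header], trusted_tip_hash: bytes) -> bool:
    """Check that headers[0] is an ancestor of the trusted tip by walking every link."""
    if not headers or headers[-1].hash() != trusted_tip_hash:
        return False
    return all(
        child.parent_hash == parent.hash()
        for parent, child in zip(headers, headers[1:])
    )


chain = build_chain(1_000)
tip_hash = chain[-1].hash()
assert verify_ancestor(chain, tip_hash)

# Tampering with an old header breaks the link to its child.
forged = [Header(0, b"\x00" * 32, b"\x11" * 32)] + chain[1:]
assert not verify_ancestor(forged, tip_hash)
```

The cost is one hash per header between the block of interest and the trusted tip, which is why the work grows linearly with how far back the data lies.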
While Ethereum light clients use this same strategy for obtaining on-chain data, there are serious limitations to this approach. One is that the process is so computationally expensive that it would be impossible to replicate on-chain. Even checking the Merkle proof on-chain would be prohibitively expensive. There are workarounds that don’t involve verifiable computation, but these also have their drawbacks. Yi continued:
“What developers do today to work around this, is they basically have two choices. One is that they can keep additional data around in their smart contract. That would incur a greater cost every single time they update, but you’d have more access. The challenge with that is that, even if you don’t care about the cost, it means when you first deploy your contract, you kind of have to anticipate all future uses of data in your contract. And you have to cache data in anticipation of that.”
“Now of course you have another option as a developer, which is that, instead of having a trustless on-chain solution, you can kind of just put the number on-chain yourself, and maybe you call it an oracle, maybe you don’t. But that obviously has the challenge that your users have to trust you to accurately put that number on-chain.”
Data access zk Coprocessors solve this problem of trust when accessing historical on-chain data. The essential idea is that, after retrieving the data, the zk Coprocessor replicates the logic of validating Ethereum block headers, states, transactions, and receipts, to prove that the requested data was correctly included in its corresponding block(s).
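To make "correctly included in its corresponding block" concrete, here is a minimal sketch of a Merkle inclusion check. It assumes a plain binary Merkle tree with SHA-256, whereas Ethereum actually commits to state, transactions, and receipts with Merkle-Patricia tries and Keccak-256; the function names and the receipt strings are made up for illustration. A data access zk Coprocessor would prove a check of this kind inside a circuit rather than run it directly.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list[bytes], index: int):
    """Return (root, proof) for leaves[index].

    Each proof step is (sibling_hash, sibling_is_on_the_right).
    """
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        sibling = index ^ 1
        proof.append((level[sibling], sibling > index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify_inclusion(leaf: bytes, proof, root: bytes) -> bool:
    """Recompute the path from the leaf up to the root using only the siblings."""
    node = h(leaf)
    for sibling, sibling_on_right in proof:
        node = h(node + sibling) if sibling_on_right else h(sibling + node)
    return node == root


receipts = [f"receipt-{i}".encode() for i in range(5)]
root, proof = merkle_root_and_proof(receipts, 3)
assert verify_inclusion(receipts[3], proof, root)
assert not verify_inclusion(b"forged receipt", proof, root)
```

The verifier needs only the leaf, the root, and a logarithmic number of sibling hashes, never the full data set.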
Scalability
Given the cost of computation on Ethereum and the space restrictions imposed by a finite block size, for very complex calculations it may be necessary to perform some of the work off-chain. Without any additional technology, this would compromise the properties of decentralization and trustlessness that we would like to preserve. Computation zk Coprocessors solve this problem by using either succinct non-interactive arguments of knowledge (SNARKs) or scalable transparent arguments of knowledge (STARKs) to ensure computational integrity.
Those familiar with zk Rollups may recognize this concept. Indeed, there are similarities between a zk Rollup and a computation zk Coprocessor, but they are not exactly the same. Both use SNARKs or STARKs to prove the correct execution of some off-chain action. However, a zk Rollup is meant to prove that a large number of transactions were sequenced and batched correctly, whereas a computation zk Coprocessor proves that an arbitrary computation was executed correctly [3]. One way to think about the difference is this: a zk Rollup allows 10,000 people to send $5 each, whereas a computation zk Coprocessor allows one person to compute a forecast using a complex weather model. Both address the shortage of on-chain computational resources in different ways.
The flow of data for a computation zk Coprocessor is as follows. A user submits a computation to the zk Coprocessor, which lives off-chain. The zk Coprocessor executes this computation, returns the result either directly to the user or to one of their smart contracts, and submits a SNARK proof of execution to a verifier smart contract. The verifier smart contract either accepts or rejects the zk Coprocessor’s proof; since the verifier smart contract lives on-chain, the result of the verification (accept or reject) is public.
Computation zk Coprocessors may be specialized for a specific type of computation (although if it is so specialized that it only handles exactly one computation it is better to call it a circuit). Usually, however, a computation zk Coprocessor is general-purpose, due to the existence of universal verifiers. A universal verifier is able to verify any circuit by taking (a compressed form of) the circuit as an input, much like how a universal Turing machine takes in a specific Turing machine as input. The math behind this construction is complex, and could fill another article. Suffice it to say that all the major computation zk Coprocessors on the market use a universal verifier smart contract.
It is also common that both the computation zk Coprocessor and universal verifier smart contract are written by the same company. While it may seem counterintuitive that the prover and verifier are the same entity, there is no actual loss of security or decentralization, since the universal verifier smart contract is open-source.
Differing Models of Computation, Differing Levels of Accessibility
One complication that we have so far glossed over is that a SNARK technically does not prove the correct execution of a program in the traditional sense, but rather the satisfaction of a set of constraints representing an arithmetic circuit (also called a circuit when the context is clear). This arithmetic circuit, in turn, represents the program. Circuit engineers must train to write in domain-specific languages (DSLs) in order to write constraints for a circuit corresponding to a given program. A circuit is underconstrained if it is possible to satisfy all the constraints with an incorrect run through the computation, where the meaning of “incorrect” depends on the intended logic of the program. Underconstrained circuits can lead to unexpected behavior or security vulnerabilities. Given the amount of specialized knowledge needed to understand them, they are also difficult to audit.
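A toy example may help show what "underconstrained" means. The sketch below is not written in any real circuit DSL: it models constraints as Python predicates over a deliberately tiny prime field (an assumption made so that exhaustive search is feasible) and compares two hypothetical constraint sets for the program "return 1 if x is zero, else 0".

```python
from itertools import product

P = 31  # tiny prime field for exhaustive search; real systems use primes of roughly 254 bits


def is_zero_program(x: int) -> int:
    """The intended program: return 1 if x is zero, else 0."""
    return 1 if x == 0 else 0


def underconstrained(x: int, inv: int, out: int) -> bool:
    # Only forces out to be 0 when x is nonzero; says nothing when x == 0.
    return (x * out) % P == 0


def fully_constrained(x: int, inv: int, out: int) -> bool:
    # out = 1 - x * inv  and  x * out = 0  together pin out to is_zero_program(x).
    return (out - (1 - x * inv)) % P == 0 and (x * out) % P == 0


def find_bad_witness(constraints):
    """Search for a witness that satisfies the constraints but disagrees with the program."""
    for x, inv, out in product(range(P), repeat=3):
        if constraints(x, inv, out) and out != is_zero_program(x):
            return (x, inv, out)
    return None


print(find_bad_witness(underconstrained))   # (0, 0, 0): accepted, yet the program says out = 1
print(find_bad_witness(fully_constrained))  # None: every satisfying witness is a correct run
```

In the first constraint set a malicious prover can claim that zero is nonzero and still produce a valid proof. Exhaustive search is obviously impossible at production field sizes, which is part of why such bugs are hard to find in an audit.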
There are several approaches to bridging the gap between programming in the traditional Turing machine model and writing arithmetic circuits. In order of “most like a circuit” to “most like a program,” the general categories are:
- Libraries
Libraries can make the process of writing circuits easier by modularizing some frequently-used sets of constraints. As we will talk about more later, this is the approach used by Axiom SDK. In this case the developer is still writing a circuit, but they have a few “shortcuts” they can use to make it easier.
- DSLs that are Programming Languages
A good example of this is ZoKrates. Instead of writing a circuit, the developer is writing in a domain-specific language, and the compiler generates the circuit constraints at compile time. This simplifies the process of writing circuits, with the slight trade-off that the developer gives up some ability to optimize their circuit. Note, however, that programming languages such as ZoKrates are not general-purpose: they are explicitly for writing circuits. As such they generally don’t have a large community of developers or a mature ecosystem.
- Programming Languages, Compiled to a Circuit by a zkVM
Here the developer writes a program in a general-purpose programming language such as TypeScript, and a zero-knowledge virtual machine (zkVM) compiles it to a circuit. The zkVM abstracts away constraint generation entirely. The developer is free to write any program that can be written in the Turing machine model of computation, although some programs, such as non-halting ones, won’t compile successfully. This is the approach adopted by ORA and RISC Zero in their respective zk Coprocessors.
For companies that offer a zk Coprocessor as one of their products or services, it is important to consider how they address the difference between circuits and programs. There are advantages and disadvantages to each, and which one is best for the developer depends on their intended use case and the trade-offs they are willing to make.
The following diagram represents the ladder of abstraction of zero knowledge proof systems. At the lowest, most fundamental level, a SNARK or STARK is either a small set of elliptic curve elements or hashes. In systems with a zkVM, the highest level is developer source code written in a high-level programming language; in systems without a zkVM, the highest level is an arithmetic circuit representing a specific computation.
If the most important factor is performance, then circuits are better, as the developer can optimize them “by hand.” The trade-off here is that learning to write circuits is difficult, and each new circuit must be audited to guarantee security before being put into production. On the other hand, programs compiled to circuits by a zkVM are generally less performant, since low-level optimizations are not possible; however, writing a program is much more straightforward than writing the equivalent circuit. Furthermore, once the circuits for the zkVM opcodes have been audited, the compiled circuits that result from composing those opcode circuits are guaranteed to be fully constrained. This doesn’t necessarily eliminate the need for auditing, but reduces the task to auditing the high-level code, a much easier proposition. Sometimes a zk Coprocessor will make use of a zkVM for general computation, but include some built-in optimized circuits for a fixed set of frequently-performed, computationally expensive (usually cryptographic) operations, and these circuits are audited separately.
Overview of Different Protocols
In this section we describe some of the different ways zk Coprocessors are implemented in companies that currently offer them. The goal is to give the reader a clear picture of each protocol by describing its developer experience and workflow, what use cases it is tailored to, and, where possible, its performance. This list is not exhaustive, but represents a good overview of the different approaches being taken in the zk Coprocessor space.
With respect to performance, it should be noted that there are currently no standard benchmarks for comparing zk Coprocessors. This is partly due to the fact that the field itself is relatively new, and partly because different zk Coprocessors often prove different statements, making apples-to-apples comparisons difficult. Even comparing proving schemes and their implementations is a complex problem: see [4].
The landscape of zk Coprocessors is constantly changing, so new developments may arise at any time. Accordingly, we may update or give an addendum to this article as needed.
Axiom
Towards the beginning of this article, we quoted Axiom cofounder Yi Sun to explain the purpose of a data access zk Coprocessor. This is because Axiom started out as a data access zk Coprocessor, though now they also have a computation zk Coprocessor in their suite of tools. At the moment, Axiom queries are limited to 128 pieces of data at a time by default, although Axiom can support larger queries on a partnership basis [5].
Developers access the zk Coprocessor capabilities of Axiom through Axiom SDK, a TypeScript library. The process can be divided into three parts:
- Get data trustlessly from the blockchain (uses data access zk Coprocessor).
- Design a client circuit for computation (uses computation zk Coprocessor).
- Add return values to the final callback, to integrate with your own smart contract.
While Step 1 is handled simply by calling the appropriate functions from the library, Step 2 is more complicated, since it involves writing a circuit. Axiom SDK makes this somewhat easier by providing built-in mathematical operations, zero checks, and range checks. While the built-in operations modularize circuits that are not difficult to write per se, writing them from scratch can be very time-consuming, so using Axiom SDK does make the process considerably easier. There are still some difficulties imposed by the arithmetic circuit model of computation, however. In particular, loops and branching are not supported, control flow must be handled by multiplexers, and the total number of inputs must be known at compile time.
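To illustrate what handling control flow with a multiplexer means, here is a minimal sketch in plain Python. It does not use the Axiom SDK API; the `mux` helper, the modulus, and the absolute-difference example are assumptions chosen only to show the pattern of computing both branches and selecting one with a constrained bit.

```python
P = 2**61 - 1  # a prime modulus for illustration (2^61 - 1 is a Mersenne prime)


def mux(sel: int, if_true: int, if_false: int) -> int:
    """Branch-free select: sel * if_true + (1 - sel) * if_false over the field."""
    assert (sel * (sel - 1)) % P == 0, "sel must be constrained to 0 or 1"
    return (sel * if_true + (1 - sel) * if_false) % P


def abs_diff_program(a: int, b: int) -> int:
    """Ordinary control flow, as one would write it in a program."""
    if a >= b:
        return a - b
    return b - a


def abs_diff_circuit_style(a: int, b: int, a_ge_b: int) -> int:
    """Both branches are always computed; the selector bit picks one.

    In a real circuit a_ge_b would itself be produced by a range-check gadget;
    here it is simply supplied by the caller.
    """
    return mux(a_ge_b, (a - b) % P, (b - a) % P)


assert abs_diff_circuit_style(10, 3, 1) == abs_diff_program(10, 3) == 7
assert abs_diff_circuit_style(3, 10, 0) == abs_diff_program(3, 10) == 7
```

Because both branches are evaluated regardless of the selector, the cost of a "conditional" in a circuit is the sum of its branches rather than the cost of the branch taken.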
On the positive side, Axiom’s documentation is fairly thorough, with examples and tutorials to illustrate the development process, including how to integrate the client circuit with the frontend. Example client circuits are provided, and developers can test their client circuits with Foundry.
One notable feature of Axiom is that it is available on Ethereum mainnet as of January 22, 2024, and its core smart contracts and circuits have been audited. This indicates that they and their auditors are confident in the security of their protocol, and that community testing on Sepolia has so far been a success.
Axiom’s circuits are built using halo2-lib and use the halo2 proving system (Plonkish arithmetization with an inner product argument). This is highly performant in terms of prover time and proof size, especially for small circuits. In the later section on Maru, we refer to a comparison between halo2 and Starky.
ORA
ORA, formerly known as HyperOracle, is a zkOracle network and protocol. A zkOracle is an input/output blockchain oracle with the ability both for trustless data access and computation. So a zkOracle includes both a data access zk Coprocessor and computation zk Coprocessor, plus the ability to post the processed data to the blockchain when certain trigger conditions are met. This last feature is called zkAutomation.
To use ORA’s zkOracle network, a developer writes a computational entity or CLE (formerly known as a zkGraph), a TypeScript program following a certain predefined format. Technically, computational entities are written in AssemblyScript, which is just TypeScript with additional functionality for compiling to WebAssembly (WASM). This is also augmented with ORA’s custom library cle-lib (previously called zkgraph-lib), which provides functions to trustlessly retrieve Ethereum data.
ORA uses zkWASM, developed by Delphinus Labs, as the backend zkVM. This is what allows developers to write programs rather than circuits. Under the hood, the process is as follows.
- The developer writes a program in TypeScript + AssemblyScript + cle-lib.
  - Trustlessly obtain data from the blockchain with cle-lib (uses data access zk Coprocessor).
  - Specify custom computation in AssemblyScript (uses computation zk Coprocessor).
  - Optionally add a data destination to send return data (uses zkAutomation).
- The program is compiled to a WebAssembly (WASM) binary file.
- The WASM binary is then compiled to a halo2 circuit by zkWASM.
- The circuit is sent to ORA’s network for proof generation.
  - If zkAutomation is used, automations are paid for by a “bounty reward per trigger” system.
- The proof is sent to ORA’s universal verifier smart contract for verification.
WebAssembly was chosen from the outset as the instruction set in order to create compatibility with the Graph protocol: subgraphs can be changed to computational entities with just 10 lines of configuration difference. So developers who have built subgraphs, or those familiar with AssemblyScript, will have the easiest time using ORA’s zkOracle network. However, new developers should not find the process too difficult. Scaffolds are available so that developers can bootstrap their projects, and the developer documentation is helpful. As of January 2024, there were 30 community-built computational entities and counting [6].
The zkWASM VM, and the custom circuits added to it, are written in halo2-pse, the halo2 library developed by Privacy and Scaling Explorations of the Ethereum Foundation, which uses Plonk with KZG as the underlying proving scheme. As of February 2024, proof generation by zkWASM servers in the ORA network takes around 40 seconds for most CLEs.
Currently, ORA’s network supports Ethereum mainnet only as a data source, not as a data destination, meaning that zkAutomation is not yet available on mainnet. ORA plans to expand zkAutomation to mainnet and other chains in the future.
Proxima One: Maru
Maru is a zk Coprocessor by Proxima One Labs that is focused on computation over large data sets. The prototypical example is the calculation of trading volume over a large block range. Essentially, Maru uses the Merkle mountain range structure, along with recursive STARK proofs, to prove the validity of a large collection of receipts (events) as leaf data in their respective receipt tries. Maru’s chosen proving system is Starky, an adaptation of Plonky2 with cross-table lookups that is built for fast recursion [7].
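For readers unfamiliar with the Merkle mountain range, the following sketch shows the basic append-only structure. It is a generic textbook-style illustration, not Maru's implementation: the class name, the use of SHA-256, and the right-to-left "bagging" of peaks into one root are all assumptions, and the recursive STARK proofs that Maru layers on top are not modeled.

```python
import hashlib


def h(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(left + right).digest()


class MerkleMountainRange:
    """Append-only accumulator kept as a list of perfect-binary-tree peaks."""

    def __init__(self) -> None:
        self.peaks: list[tuple[int, bytes]] = []  # (height, root hash), tallest first
        self.size = 0

    def append(self, leaf: bytes) -> None:
        height, node = 0, hashlib.sha256(leaf).digest()
        # Merge while the rightmost peak has the same height, like a binary carry.
        while self.peaks and self.peaks[-1][0] == height:
            _, left = self.peaks.pop()
            node = h(left, node)
            height += 1
        self.peaks.append((height, node))
        self.size += 1

    def root(self) -> bytes:
        """Bag the peaks right-to-left into a single commitment."""
        if not self.peaks:
            raise ValueError("empty MMR has no root")
        acc = self.peaks[-1][1]
        for _, peak in reversed(self.peaks[:-1]):
            acc = h(peak, acc)
        return acc


mmr = MerkleMountainRange()
for i in range(11):
    mmr.append(f"receipt-{i}".encode())
print([height for height, _ in mmr.peaks])  # [3, 1, 0], matching 11 = 0b1011
```

Appending never rewrites existing subtrees, so the accumulator can grow alongside the chain while earlier subtrees, and anything already proven about them, remain valid.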
Currently, developers working with Maru write entirely in Solidity, with added functions for trustlessly accessing and processing Ethereum data. The developer then links a GitHub repo containing their augmented smart contracts to Maru’s platform, which then compiles circuit code and adds it to the repository automatically.
A benchmarking test released by Maru comparing the performance of Maru with Axiom and JumpCrypto on setup, prover time, proof size, and verification time of Keccak-256 circuits of various input lengths found the following.
- Setup: Axiom outperforms Maru for input sizes less than around 28KB, but for Keccak-256 circuits with input sizes larger than 28KB, Maru outperforms Axiom.
- Prover Time: Axiom outperforms Maru for inputs of size up to about 2KB, but Maru outperforms Axiom for input sizes larger than 2KB.
- Proof Size: Axiom outperforms Maru for all input sizes.
- Verifier Time: Axiom outperforms Maru for almost all input sizes.
This indicates that Maru is optimized to perform well for large circuits. See [8] for full details on the methodology.
The long-term vision for Maru is more ambitious, going beyond the role of zk Coprocessor to that of a shared execution layer whose function is to provide real-time composability of smart contracts across Layer 2 platforms. For this to be possible, it is necessary to generate all reasonable proofs within 12 seconds, the block time of Ethereum. To improve efficiency of the underlying computations on which the proofs are based, Maru is developing a DSL around a model of computation that operates not on states, but state deltas.
RISC Zero
RISC Zero the company is best known for its zkVM, also called RISC Zero, based on the Reduced Instruction Set Computer Five (RISC-V) instruction set. Recently, RISC Zero has released a computation zk Coprocessor, called Bonsai, that uses the RISC Zero zkVM as the backend [9].
The RISC Zero zkVM translates RISC-V opcodes to arithmetic circuits. Therefore it can be used to prove the correct execution of programs written in any language that compiles to RISC-V. This includes Rust and C++, though it appears that RISC Zero’s intended audience is Rust developers, since all projects we could find that use RISC Zero were based on Rust source code. This makes sense, since Rust is known for its performance and safety guarantees. The downside is that the Rust developer community is still much smaller than that of JavaScript or its variants, so the TypeScript-based libraries of Axiom and ORA are accessible to a wider base of new developers. However, Rust is more popular within the web3 community than in the general developer community, and the Rust community is growing.
As of February 2024, access to Bonsai is permissioned, since it is in the early alpha stage of development. Those interested in using Bonsai as part of their DApp will have to request an API key.
There are several finished open-source projects that use Bonsai. Interestingly, within four weeks RISC Zero engineers were able to write a zkEVM, called Zeth, using the computation zk Coprocessor Bonsai [10]. A zkEVM is a kind of zkVM that specifically replicates the logic of an Ethereum validator. A zkEVM can prove that an entire Ethereum block is valid, in contrast to a data access zk Coprocessor, which proves the validity of specific information within a block.
The four-week creation of Zeth is an impressive proof of concept because, due to the complexities of EVM verification and the zk-unfriendly nature of EVM cryptographic primitives, data access zk Coprocessors normally take months or years to write. The fast development time was possible due to the portability of Rust crates with the RISC Zero zkVM, allowing for code reuse from non-ZK applications. Specifically, while RISC Zero had its own accelerators for ECDSA verification, much of the work was ported over from the revm crate, an implementation of the Ethereum Virtual Machine written in Rust. This fast development time comes at the expense of optimization and prover time. Using parallelism, the RISC Zero team was able to verify an Ethereum block in 50 minutes, while estimating that with an appropriately sized GPU cluster the proof could take between 9 and 12 minutes, including execution time. This illustrates a principle that will likely hold in the future, even with improvements to Bonsai’s underlying zkVM: there is a trade-off between code reuse and performance.
Summary
A developer wishing to build a dApp which is fully decentralized and trustless may use a zk Coprocessor for historical data access or verifiable off-chain computation. Which one they should use depends on the circumstances. The ideal use case for each zk Coprocessor seems to be the following.
- Axiom for developers who know Solidity, TypeScript, and how to write circuits and are looking for a performant solution involving small-to-medium-sized circuits and up to 128 pieces of historical on-chain data.
- ORA for developers who know TypeScript and are looking for ease of use or verifiable AI integration.
- Maru for Solidity developers who want to compute using large circuits, or to compute over large sets of Ethereum event data.
- RISC Zero’s Bonsai for Rust developers who want to quickly bootstrap large projects through code reuse, specifically by turning Rust crates into circuits.
Some Thoughts on the Future of zk Coprocessors
It is always a safe bet to predict that the current technology will improve, and zk Coprocessors are no exception to this rule. The field is still very young, and there will likely be new theoretical breakthroughs as well as incremental optimizations at every level of the zk Coprocessor’s ladder of abstraction, leading to faster zk Coprocessors. As performance continues to improve, they will be used in a wider variety of decentralized applications. If prover time becomes fast enough relative to native computation, there will likely be a greater push to abstract away the arithmetic circuits, favoring protocols that use a zkVM.
One significant milestone in this direction is getting prover time significantly below 12 seconds. This is especially important for automation, as it allows for “tracking” the blockchain: that is, scanning new blocks as they come in, and submitting proofs of computation over that data in (near) real time.
In addition, there are two areas for which the technology is still in its infancy, but that promise to be important in the future: AI and big data. Greater integration of zk Coprocessors with verifiable AI will become a necessity for developers who want to use the problem-solving capabilities of AI in a way that is trustless and censorship-resistant. And, as analyzing large data sets has proven to be desirable in web2, it seems likely that verifiably doing so will be essential in web3. These two areas may be related: perhaps in the future, AI verifiably trained on large sets of blockchain data will provide verifiable inferences on that data.
References
[1] @Hill, SevenXVentures, “zkOracle and zk Coprocessor,” Nov. 28, 2023. https://mirror.xyz/sevenxventures.eth/_JMWKYVob9x3COlO7V2k71pN_pZORc4DVcByhfU0ZaQ
[2] Yi Sun and Rex, “Installing Ethereum’s ZK Coprocessor w/ Yi Sun (Axiom),” Strange Water Podcast Episode 27, Sep. 2023. https://open.spotify.com/episode/699BT0RlOde6qhhqtGxX5A
[3] Mrig P., thirdweb, “ZK Rollup Comparison: What is a ZK Rollup & Which is Best?,” May 24, 2023. https://blog.thirdweb.com/zero-knowledge-rollups-zk/
[4] Daniel Benarroch, Aurelien Nicolas, Justin Thaler, and Eran Tromer, “Community Proposal: A Benchmarking Framework for (Zero-Knowledge) Proof Systems,” April 9, 2020. https://docs.zkproof.org/pages/standards/accepted-workshop3/proposal-benchmarking.pdf
[5] Axiom Documentation. https://docs.axiom.xyz/
[6] Awesome ORA Github Repository. https://github.com/ora-io/awesome-ora
[7] Anton Yezhov, Denis Kanonik, Alex Rusnak, and the Maru Team, “Why Event Proofs? Why Starky?” October 2023. https://maru.network/blog/why_event_proofs
[8] Stanislav Karaschuk, Denis Kanonik, Kateryna Kuznetsova, Oleksandr Kuznetsov, Anton Yezhov, and Alex Rusnak, “Keccak256 circuit benchmarks,” Sept. 15, 2023. https://github.com/proxima-one/keccak-circuit-benchmarks/blob/master/full_description.pdf
[9] RISC Zero Bonsai Documentation. https://dev.risczero.com/api/0.20/bonsai/
[10] Tim Carstens, Victor Graf, Rami Khalil, Steven Li, Parker Thompson, and Wolfgang Welz, “Announcing Zeth: the first Type Zero zkEVM,” August 22, 2023. https://www.risczero.com/blog/zeth-release
Disclosures
The primary author of this article, Levi Sledd, previously worked for ORA (formerly HyperOracle) and Symbolic Capital is an investor in ORA. The author has taken care to be as unbiased as possible in all aspects of reporting for this piece. All information presented here is true to the best of the author’s knowledge. All errata will be taken into account and the article will be updated accordingly.