Sincere thanks to Anna Kazlauskas (Vana), Marvin Tong (Phala), Dylan Zhang (Pond), Shu Dong (MIZU), and Sam Lehman (Symbolic) for their thoughtful insights, discussion, and review of this piece.
An Overview of the AI and Crypto Landscape
The past year has brought a wave of new companies building at the intersection of blockchain and artificial intelligence. These companies aim to develop permissionless, decentralized, and open-source alternatives to traditional centralized AI platforms. While most companies share a common vision of decentralizing AI applications and infrastructure, we see numerous categories within AI in which these companies are building.
From data management and model training to computation and deployment, the AI stack is becoming increasingly diverse. This diversification reflects the complex nature of AI systems and the multifaceted approach required to decentralize the technology.
The purpose of this article is to provide a comprehensive framework for categorizing blockchain AI companies. By examining their specific areas of concentration, technological approaches, and value propositions, we aim to offer clarity in an increasingly crowded and complex field.
Framework
The AI space can be represented using a framework of four key components and two primary workflows. The components are data, models, agents, and compute; the workflows are training/fine-tuning and inferencing. A diagram follows, defining each component and illustrating the relationships between them.
Let's examine each component and workflow in detail, exploring the companies at the forefront of innovation in these areas and the key challenges they're addressing.
Data
Data is the foundation for all AI models. Data is used at all layers of model development - training, testing, and validation. Blockchain technology is increasingly being leveraged to revolutionize data management in AI, particularly in two key areas:
- Data Ownership and Privacy: As proprietary data becomes increasingly valuable in AI development, blockchain offers solutions for data monetization and privacy protection. Contributors can maintain control over their data, securely sharing it while preserving confidentiality.
- Data Contribution and Labeling: Blockchain systems can create decentralized marketplaces where contributors are incentivized with tokens for providing and labeling data. This approach aims to improve data quality and availability while potentially reducing costs.
Several companies are leveraging Web3 technologies to revolutionize data collection and filtering processes. Key areas include:
- Data Ownership: Emphasizes collecting and maintaining proprietary data privacy while enabling monetization.
- Crowdsourced Labeling: Incentivizes a network of human contributors to annotate raw data.
- Synthetic Data Generation: Utilizes AI models to create artificial datasets, reducing reliance on human-generated data.
- Decentralized Web Scraping: Employs a distributed network to crawl and gather web data.
- Open-Source Collaboration: Facilitates community-driven contribution and editing of datasets for open-source initiatives.
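As a rough illustration of the crowdsourced labeling idea above, the sketch below aggregates labels by majority vote and splits a per-item reward among the contributors who agreed with the majority. The function name, the reward amount, and the tie-handling rule are assumptions made for this example; they are not taken from any specific project.

```python
from collections import Counter


def aggregate_labels(votes, reward_per_item=1.0):
    """Majority-vote label aggregation with a simple reward split.

    votes: dict mapping item_id -> {contributor: label}.
    Returns (consensus, rewards). Items with a tied top label get no
    consensus label and pay no reward (a hypothetical policy).
    """
    consensus = {}
    rewards = Counter()
    for item_id, item_votes in votes.items():
        if not item_votes:
            continue
        counts = Counter(item_votes.values())
        (top_label, top_count), *rest = counts.most_common()
        # A tie means there is no clear majority, so skip the item.
        if rest and rest[0][1] == top_count:
            continue
        consensus[item_id] = top_label
        winners = [c for c, label in item_votes.items() if label == top_label]
        share = reward_per_item / len(winners)
        for contributor in winners:
            rewards[contributor] += share
    return consensus, dict(rewards)


example_votes = {
    "img1": {"alice": "cat", "bob": "cat", "carol": "dog"},
    "img2": {"alice": "dog", "bob": "cat"},  # tie, so no label is accepted
}
labels, payouts = aggregate_labels(example_votes)
print(labels)   # {'img1': 'cat'}
print(payouts)  # {'alice': 0.5, 'bob': 0.5}
```

A production system would likely weight votes by contributor reputation and add penalties for low-quality work; majority vote is only the simplest starting point.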
Models
Models are algorithms that learn patterns from data to make predictions or generate outputs. Currently, the most advanced and widely-used foundational models are large language models (LLMs) developed by major tech companies like OpenAI, Google, Amazon, and Meta (formerly Facebook). This dominance stems from the resource-intensive nature of model training, which requires vast amounts of data and substantial computing power.
Many Web3 applications leverage these existing models, either by paying for access to proprietary versions or by utilizing open-source alternatives. However, as blockchain technology advances, we're beginning to see the emergence of decentralized approaches to AI model development.
Several projects are pioneering blockchain-specific foundational models. For instance, POND is exploring the use of Graph Neural Networks (GNNs) to learn and predict wallet behaviors (such as trading, social interaction, and malicious transactions) based on blockchain data. These initiatives aim to create AI models that are not only powerful but also aligned with Web3 principles of decentralization and community ownership. The following table presents a comparative overview of widely used AI models, including both mainstream options and blockchain-specific innovations like POND.
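To make the graph-based idea more concrete, here is a minimal sketch of one round of mean-aggregation message passing, the basic building block of many GNN layers, applied to a toy wallet graph. It is not POND's architecture: the features, the weights, and the graph are all hypothetical, and no training is performed.

```python
def matvec(matrix, vector):
    """Multiply a small dense matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def message_passing_layer(features, edges, w_self, w_neigh):
    """One GNN-style update: h_v' = ReLU(W_self h_v + W_neigh mean(h_u)).

    features: dict wallet -> feature vector (hypothetical features).
    edges: list of (wallet_a, wallet_b) pairs, treated as undirected.
    """
    neighbors = {node: [] for node in features}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    dim = len(next(iter(features.values())))
    updated = {}
    for node, h in features.items():
        nbrs = neighbors[node]
        if nbrs:
            mean = [sum(features[n][i] for n in nbrs) / len(nbrs)
                    for i in range(dim)]
        else:
            mean = [0.0] * dim  # isolated wallet: no incoming messages
        combined = [s + m for s, m in
                    zip(matvec(w_self, h), matvec(w_neigh, mean))]
        updated[node] = [max(0.0, x) for x in combined]  # ReLU
    return updated


# Toy graph: wallets a-b-c linked by transfers; identity weights for clarity.
identity = [[1.0, 0.0], [0.0, 1.0]]
wallets = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
transfers = [("a", "b"), ("b", "c")]
print(message_passing_layer(wallets, transfers, identity, identity))
# {'a': [1.0, 1.0], 'b': [1.0, 1.5], 'c': [1.0, 2.0]}
```

After one round, each wallet's representation mixes its own features with those of its direct counterparties; stacking more rounds lets information travel further through the transaction graph. In a real model the weight matrices would be learned from labeled behavior.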
As this field evolves, we can expect to see more blockchain-native AI models that cater specifically to the unique needs and characteristics of decentralized ecosystems. These developments could potentially shift the AI landscape, introducing new paradigms for model training, deployment, and governance in the Web3 space.
Agents
In the AI context, agents are applications powered by models and designed to take actions to achieve specific goals or complete tasks. In the Web3 space, agent development is following two main trends:
- AI-Powered Application Platforms: These projects focus on enabling developers to build on-chain applications powered by AI models, primarily Large Language Models (LLMs). Examples include:
  - Interactive anime character companions
  - Asset management bots that can execute transactions through conversational interfaces
- Blockchain-Data AI Specialists: These agents use AI models in conjunction with blockchain data to develop specific capabilities, such as:
  - Security monitoring of blockchain networks
  - Automated trading strategies
  - AI-assisted smart contract generation and auditing
Compute
Compute in AI refers to the processing power required to train and run (inference) AI models. The integration of compute resources with Web3 principles presents a compelling use case, leveraging token incentives to create decentralized computing networks. This approach can potentially harness a vast array of resources, from high-performance GPU clusters to edge devices like smartphones and personal computers. The following table illustrates key companies developing innovative solutions within each category.
While decentralized AI compute networks show promise, they currently face significant limitations. Training large models on these networks is challenging due to communication latency and the complexities of coordinating distributed resources. Consequently, the current focus is primarily on inference tasks, which are more feasible in a decentralized environment as they typically require less intensive real-time coordination.
Training
Training is the critical process of feeding vast amounts of data into algorithms that iteratively adjust a model's parameters. This process forms the backbone of creating sophisticated AI models. In contrast to the Web2 paradigm, where data and models are predominantly controlled by a few large entities, Web3 introduces a paradigm shift:
- Distributed Ownership: Web3 prioritizes a decentralized network of individuals and smaller entities, enabling data providers and model trainers to effectively monetize their assets.
- Enhanced Transparency: Web3 brings unprecedented transparency to AI development. Models become auditable, allowing scrutiny of training data sources and capabilities, thus mitigating risks of undue influence or exploitation.
Two main categories of projects are driving innovation in this space:
- Contribution Tracking and Monetization Platforms: These systems facilitate and record contributions from various parties involved in model development. By tracking each participant's input, they ensure fair compensation when a model is utilized, creating a more equitable AI ecosystem.
- Decentralized Training: Innovative approaches are being developed to decentralize the training process itself. These methods distribute the model across a network and implement parallel training algorithms, leveraging collective computational power. This approach not only democratizes AI development but also potentially enhances training efficiency and model robustness.
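One simple way to picture parallel training is data-parallel gradient averaging: each worker computes a gradient on its own data shard, and a coordinator averages those gradients before updating the shared parameters. The sketch below is a toy under stated assumptions (a one-parameter linear model, equally sized shards, no network communication) and is a simpler stand-in for the model-distribution schemes mentioned above, not a description of any project's protocol.

```python
def local_gradient(w, shard):
    """Gradient of mean squared error for y ~ w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def train(shards, w=0.0, lr=0.05, steps=200):
    """Data-parallel gradient descent with a simulated coordinator.

    Each step, every worker computes a gradient on its private shard and
    only the gradients are shared. With equally sized shards, the plain
    average equals the gradient over the full dataset.
    """
    for _ in range(steps):
        grads = [local_gradient(w, shard) for shard in shards]
        w -= lr * sum(grads) / len(grads)
    return w


# Two workers, each holding part of a dataset generated by y = 3x.
worker_shards = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
]
print(round(train(worker_shards), 6))  # 3.0
```

In a real network the averaging step is where the communication latency discussed in the Compute section becomes the bottleneck, since every update requires a round trip among participants.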
Inferencing
Inferencing is the process of utilizing a trained AI model to make predictions based on new input data, without modifying the model itself. In the Web3 context, this process leverages decentralized computing networks, offering either software services that connect user inference requests to external compute networks or proprietary compute networks providing inference services directly.
A critical aspect of Web3 inference is verification of computation: proving that decentralized GPUs actually performed the requested computation. Key verification methods include consensus, where multiple nodes run the model and compare results; OPML (Optimistic Machine Learning); ZKML (Zero-Knowledge Machine Learning); and TEE (Trusted Execution Environments). These verification methods are intended to ensure the integrity and trustworthiness of inference results in a decentralized environment.
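The consensus method is the easiest of these to sketch. In the minimal example below, several nodes return an output for the same request, outputs are compared by hash, and a result is accepted only if enough nodes agree. The function names and the `min_agree` threshold are assumptions for illustration, and the comparison presumes inference is deterministic across nodes (same weights, seeds, and numerics), which may not hold across heterogeneous hardware.

```python
import hashlib
from collections import Counter


def digest(output):
    """Hash an output string so nodes can compare results compactly."""
    return hashlib.sha256(output.encode("utf-8")).hexdigest()


def verify_by_consensus(node_outputs, min_agree):
    """Accept the most common output if at least min_agree nodes produced it.

    node_outputs: dict node_id -> output string.
    Returns (accepted_output_or_None, sorted list of dissenting node ids).
    """
    if not node_outputs:
        return None, []
    digests = {node: digest(out) for node, out in node_outputs.items()}
    top_digest, count = Counter(digests.values()).most_common(1)[0]
    if count < min_agree:
        # No sufficiently large agreeing group: reject everything.
        return None, sorted(node_outputs)
    accepted = next(out for node, out in node_outputs.items()
                    if digests[node] == top_digest)
    dissenters = sorted(n for n, d in digests.items() if d != top_digest)
    return accepted, dissenters


result, flagged = verify_by_consensus(
    {"node1": "42", "node2": "42", "node3": "41"}, min_agree=2
)
print(result, flagged)  # 42 ['node3']
```

Redundant execution multiplies compute cost by the number of replicas, which is one reason optimistic, zero-knowledge, and hardware-based approaches are explored as alternatives.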
Inference results can be further processed on-chain via smart contracts, enabling sophisticated on-chain applications. For example, developers can create asset management chatbots or agents that use Large Language Model (LLM) inferences to trigger on-chain transactions, bridging the gap between AI capabilities and blockchain functionality.
Three main categories of projects are driving innovation in this space:
- Verification and On-Chain Application: These projects verify inference results and provide tools to build on-chain applications using these results.
- Verification and Co-Processing: Some focus solely on providing verification techniques, while others also offer compute power from their own networks.
- Trusted Computing: These provide inference capabilities without built-in verification, primarily offering compute power from their own networks.
A significant advancement in the inference category is Retrieval-Augmented Generation (RAG), a technique that enhances inference by referencing specific knowledge bases. RAG complements traditional inference methods to deliver more tailored and contextually relevant responses. This approach is particularly effective in creating personalized AI agents that can access and utilize individual user information and preferences.
RAG's versatility extends to the development of domain-specific expert agents. By referencing specialized knowledge bases, these agents can demonstrate a deeper understanding of particular fields, offering more accurate and nuanced insights. It's important to note that RAG doesn't involve retraining the underlying model; rather, it enriches the output by incorporating relevant information from the knowledge base during the inference process.
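The retrieval step can be sketched in a few lines. The example below ranks knowledge-base entries by bag-of-words cosine similarity to the query and prepends the best match to the prompt; the model itself is never modified. Production systems typically use learned embeddings and a vector index instead, and the prompt template, function names, and sample documents here are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split text into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents,
                    key=lambda d: cosine(q, Counter(tokenize(d))),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Assemble a prompt that supplies retrieved context to the model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Use the context to answer.\nContext:\n{context}\nQuestion: {query}"


knowledge_base = [
    "The user prefers low-risk stablecoin strategies.",
    "The user lives in Lisbon.",
    "Gas fees spike during network congestion.",
]
print(build_prompt("What risk level does the user prefer?", knowledge_base))
```

The resulting prompt would then be sent to whichever model serves the inference request; swapping the knowledge base changes the agent's expertise without any retraining.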
Others
Beyond the main categories, several innovative projects are employing diverse approaches to advance blockchain AI. These initiatives primarily focus on enhancing data privacy, ensuring computing privacy, optimizing processing techniques, and/or improving human authentication methods. Each project addresses critical challenges in the blockchain AI ecosystem, contributing to its overall progression and security.
Key areas of focus include:
- Data Privacy: Developing methods to protect sensitive information while enabling AI processing
- Incentive Design: Creating novel incentive frameworks for a variety of tasks
- Computing Privacy: Ensuring that AI operations remain confidential and secure
- Human Authentication: Implementing robust systems to verify human users in AI interactions
Token Utility
The question "Why do you need a blockchain or a token?" is frequently posed by engineers and investors alike. To justify the implementation of a token, it's crucial to understand its role as an incentive mechanism within a project's ecosystem. Tokens become particularly valuable when a network relies on distributed resources for its functionality and growth.
Token incentives are most effectively applied in three key areas:
- Network Security: Tokens incentivize nodes to participate in consensus mechanisms, validating transactions and securing the network.
- High-Quality Data Contribution: By rewarding data contributors with tokens, projects can encourage the submission of valuable, accurate, and diverse datasets. This is especially crucial for AI and machine learning applications that rely on high-quality training data.
- Distributed Computing Power: Tokens can motivate individuals or organizations to contribute their computing resources to the network. This model enables projects to access scalable computational power without relying on centralized infrastructure.
Projects incorporating any of these elements often find it easier to justify their token's utility.
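As a simple illustration of how such incentives might be computed, the sketch below splits a fixed per-epoch token reward among contributors in proportion to the work they supplied. The function name, the notion of "work units", and the rounding rule are assumptions for this example; real networks define their own metering and emission schedules.

```python
def distribute_rewards(contributions, epoch_reward):
    """Split epoch_reward pro rata by contributed work units.

    contributions: dict contributor -> non-negative integer work units
                   (for example, verified GPU-seconds; hypothetical metric).
    epoch_reward: integer number of tokens, in the smallest unit.
    Integer math avoids rounding drift; leftover units from floor division
    go to the largest contributors first (ties broken by name).
    """
    total = sum(contributions.values())
    if total == 0:
        return {c: 0 for c in contributions}
    payouts = {c: epoch_reward * units // total
               for c, units in contributions.items()}
    leftover = epoch_reward - sum(payouts.values())
    ranked = sorted(contributions, key=lambda c: (-contributions[c], c))
    for contributor in ranked[:leftover]:
        payouts[contributor] += 1
    return payouts


print(distribute_rewards({"gpu_a": 3, "gpu_b": 1, "phone_c": 1}, 100))
# {'gpu_a': 60, 'gpu_b': 20, 'phone_c': 20}
```

The payouts always sum exactly to the epoch reward, which matters when the amounts correspond to on-chain balances.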
Challenges and Opportunities
A helpful framework we use to evaluate Crypto x AI projects is to separate them into two distinct categories. Each brings its own set of challenges and opportunities:
- AI for Crypto: Utilizing AI models, especially those trained on blockchain data, to predict user behaviors in the crypto space.
  - Challenge: Limited market size due to the current relatively small user base in crypto and blockchain.
  - Opportunity: Potential for significant growth as blockchain adoption increases over time, expanding the market size.
- Crypto for AI: Leveraging crypto incentive mechanisms to enhance AI capabilities, such as protecting data/model ownership, increasing transparency in training and inference, and reducing costs.
  - Challenge: Difficulty in competing with Web2 performance in terms of model quality, training speed, and overall efficiency.
  - Opportunity: The scarcity of public internet data for AI training is increasing the demand for diverse, private data sources where blockchain can offer secure, transparent, and incentivized data sharing systems. In addition, AI models are becoming more compact and efficient, enabling distributed training and inference.
As the AI and crypto landscape evolves, projects in each category must strategically navigate their unique limitations and opportunities. Projects utilizing AI for crypto should focus on serving existing users with premium, tailored services while managing expenses in line with mainstream crypto adoption rates. This approach allows them to capitalize on their niche expertise and build a sustainable business model within the current market constraints.
Conversely, projects leveraging crypto for AI face challenges in competing with Web2 model performance. To succeed, they need to identify and exploit their competitive advantages, primarily in areas such as reduced compute costs and enhanced data sourcing capabilities. By focusing on these strengths, they can carve out a distinct value proposition in the AI market.
Legal Disclosure: This document, and the information contained herein, has been provided to you by Hyperedge Technology LP and its affiliates (“Symbolic Capital”) solely for informational purposes. This document may not be reproduced or redistributed in whole or in part, in any format, without the express written approval of Symbolic Capital. Neither the information, nor any opinion contained in this document, constitutes an offer to buy or sell, or a solicitation of an offer to buy or sell, any advisory services, securities, futures, options or other financial instruments or to participate in any advisory services or trading strategy. Nothing contained in this document constitutes investment, legal or tax advice or is an endorsement of any of the digital assets or companies mentioned herein. You should make your own investigations and evaluations of the information herein. Any decisions based on information contained in this document are the sole responsibility of the reader. Certain statements in this document reflect Symbolic Capital’s views, estimates, opinions or predictions (which may be based on proprietary models and assumptions, including, in particular, Symbolic Capital’s views on the current and future market for certain digital assets), and there is no guarantee that these views, estimates, opinions or predictions are currently accurate or that they will be ultimately realized. To the extent these assumptions or models are not correct or circumstances change, the actual performance may vary substantially from, and be less than, the estimates included herein. None of Symbolic Capital nor any of its affiliates, shareholders, partners, members, directors, officers, management, employees or representatives makes any representation or warranty, express or implied, as to the accuracy or completeness of any of the information or any other information (whether communicated in written or oral form) transmitted or made available to you. 
Each of the aforementioned parties expressly disclaims any and all liability relating to or resulting from the use of this information. Certain information contained herein (including financial information) has been obtained from published and non-published sources. Such information has not been independently verified by Symbolic Capital, and Symbolic Capital does not assume responsibility for the accuracy of such information. Affiliates of Symbolic Capital may have owned or may own investments in some of the digital assets and protocols discussed in this document. Except where otherwise indicated, the information in this document is based on matters as they exist as of the date of preparation and not as of any future date, and will not be updated or otherwise revised to reflect information that subsequently becomes available, or circumstances existing or changes occurring after the date hereof. This document provides links to other websites that we think might be of interest to you. Please note that when you click on one of these links, you may be moving to a provider’s website that is not associated with Symbolic Capital. These linked sites and their providers are not controlled by us, and we are not responsible for the contents or the proper operation of any linked site. The inclusion of any link does not imply our endorsement or our adoption of the statements therein. We encourage you to read the terms of use and privacy statements of these linked sites as their policies may differ from ours. The foregoing does not constitute a “research report” as defined by FINRA Rule 2241 or a “debt research report” as defined by FINRA Rule 2242 and was not prepared by Symbolic Capital Partners LLC. For all inquiries, please email info@symbolic.capital. © Copyright Hyperedge Capital LP 2024. All rights reserved.