Decentralised Compute for AI Development

This report explores the subsectors of decentralised compute networks, their representative projects, and how they contribute to the development of AI.

Apr 08, 2024



Executive Summary

  • Decentralised compute networks are systems that leverage blockchain technology to deliver computing services in a decentralised and secure manner. They can be categorised into decentralised compute, decentralised machine learning training, Zero-Knowledge machine learning (ZKML), and ZK coprocessors.
  • Decentralised compute provides unused computational resources such as GPUs and creates open computational marketplaces.
    • Render Network is a platform that consists of 5,600 GPU provider nodes and utilises over 50,000 GPUs worldwide. It operates by pooling dormant GPU power and establishing a marketplace where individuals and entities can leverage this untapped resource. 
    • Akash Network enables users to lease high-performance GPUs for artificial intelligence (AI) training and inferences, primarily targeting AI developers in need of GPU spot instances.
  • Decentralised machine learning (ML) training is derived from general decentralised compute and focuses on the training process in ML. 
    • Bittensor is a decentralised protocol that facilitates collaboration in ML and incentivises the production of machine intelligence. It introduced subnets, which are specialised networks dedicated to specific ML use cases or resource provision.
    • Gensyn is a decentralised and incentivised market for ML compute, addressing the challenge of verifying completed ML work in decentralised compute networks.
  • ZKML is a combination of ML and AI techniques with Zero-Knowledge (ZK) proofs, enabling the verification of complex ML models and algorithms without exposing their details or the underlying training data.
    • ZKML is used in Worldcoin to securely store biometrics on mobile devices. Users can generate iris codes using an ML model and create a ZK proof locally to validate the iris code’s creation.
  • ZK coprocessors work by enabling smart contracts to trustlessly delegate access to historic on-chain data and computation over it using ZK proofs, with projects like Axiom at the forefront.
  • By harnessing the power of distributed networks, decentralised compute networks empower developers to unlock additional GPUs, access more cost-efficient alternatives, and tap new computational resources, which they can use to build their own AI models and applications.

1. AI and Compute

1.1 Issues in Scaling AI

The artificial intelligence (AI) sector has advanced by leaps and bounds in terms of its models’ capabilities. Large AI models today have marked a shift in scaling, as they not only enable better ‘narrow’ AI systems that master a single task but also lead to ‘general’ AI systems that can handle various tasks. As AI models grow exponentially in parameter size, AI is expected to continually improve, consuming more computational resources in model training as a result.

Among the most important resources in high demand today, Graphics Processing Units (GPUs) have become strategic and competitive assets amongst AI companies, and even amongst countries. A GPU is a specialised processor designed to handle graphics manipulation and parallel data processing through a large number of cores. GPUs excel in AI applications because they can process large amounts of data in parallel, executing batched instructions at high throughput to speed up processing and display.

In the context of AI, GPUs are more favourable than Central Processing Units (CPUs) due to their ability to accelerate high-performance computing tasks, especially those involving parallel processing. CPUs, however, are well-suited for tasks that involve sequential processing, single-threaded applications, and other operations that do not benefit significantly from parallelisation.
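To make the contrast concrete, here is a toy sketch of sequential (CPU-style) versus data-parallel (GPU-style) execution. NumPy’s vectorised operations stand in for parallel hardware, so this illustrates the programming model rather than actual GPU code:

```python
import numpy as np

# Sequential, CPU-style: process each element one at a time.
def scale_sequential(values, factor):
    result = []
    for v in values:
        result.append(v * factor)
    return result

# Data-parallel, GPU-style: apply the same instruction to all elements
# at once (NumPy vectorisation stands in for GPU-style SIMD here).
def scale_parallel(values, factor):
    return (np.asarray(values) * factor).tolist()

data = [1.0, 2.0, 3.0, 4.0]
assert scale_sequential(data, 2.0) == scale_parallel(data, 2.0)
```

Both functions compute the same result; the parallel form expresses the work as one batched operation, which is the shape of computation GPUs accelerate.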

Sources: Arxiv, Gensyn

The accelerated growth of large AI models has created exponential demand for GPUs; combined with supply chain and geopolitical constraints, this has driven up the immense cost of AI training. Typically, developers train their models using NVIDIA’s more advanced (e.g., enterprise-grade) GPUs, which can cost US$10,000-US$12,000 per unit. OpenAI’s GPT-3 needed 1,000 GPUs and Stability AI needed 4,000 GPUs, costing these companies millions just to train their models on their datasets.
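A quick back-of-the-envelope calculation using the unit prices above shows why training bills run into the millions (acquisition cost only, ignoring power, networking, and operations):

```python
# Back-of-the-envelope GPU acquisition cost, using the figures quoted
# above: US$10,000-12,000 per enterprise-grade GPU.
def gpu_cost_range(num_gpus, low=10_000, high=12_000):
    return num_gpus * low, num_gpus * high

gpt3_low, gpt3_high = gpu_cost_range(1_000)   # OpenAI GPT-3: 1,000 GPUs
sd_low, sd_high = gpu_cost_range(4_000)       # Stability AI: 4,000 GPUs
print(f"GPT-3 hardware: US${gpt3_low:,}-{gpt3_high:,}")
print(f"Stability AI hardware: US${sd_low:,}-{sd_high:,}")
```

At 1,000 GPUs the hardware alone lands at US$10-12 million before a single training run.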


GPUs like NVIDIA A100 and H100 clusters are the workhorses powering the AI race. Realising the potential revenue opportunity that AI presents, centralised cloud tech platforms have been stockpiling GPUs, both to amass computing power for their own use and, in some cases, to rent it out to customers willing to pay the price.


As a result, developers face a “Sophie’s choice” when securing the computational power needed to train their models: either invest in their own hardware or pay a cloud provider’s inflated prices. Due to the sheer expense of the computing power needed, development has largely been left in the hands of tech giants who can afford it.

1.2 Decentralised Compute Networks

The AI race has resulted in a rapidly escalating demand for computing resources. Hence, the need for scalable computing power is crucial for the continued development of AI, and decentralised compute networks provide a potentially sustainable solution to the key challenges mentioned above.

In our previous report, we discussed the emerging decentralised compute network category under the DePIN sector. A decentralised compute network is a system that leverages blockchain technology to deliver computing services in a decentralised and secure manner. In this network, the processing power and resources required for executing tasks are not concentrated in a single central server or data centre. Instead, the workload is distributed amongst multiple nodes in the network, allowing for parallel processing and increased scalability. 
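The core workload-distribution idea can be sketched in a few lines: split a job into independent tasks and fan them out to worker “nodes”. In this toy sketch the nodes are simulated with local threads; a real decentralised network adds scheduling, payment, and verification layers on top:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-in for a unit of compute work a node would execute
# (e.g., rendering one tile of a frame).
def render_tile(tile_id):
    return tile_id, sum(i * i for i in range(1_000))

def distribute(job_tiles, num_nodes=4):
    # Fan the independent tasks out to worker "nodes" in parallel
    # and gather the results back into one mapping.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        return dict(pool.map(render_tile, job_tiles))

results = distribute(range(8))
assert len(results) == 8
```

Because the tasks are independent, adding nodes increases throughput, which is exactly the scalability property these networks rely on.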

Decentralised compute networks offer several advantages and benefits for those who want to develop their own AI solutions, including: 

  • Cost Efficiency: While decentralised networks may require more machines, leveraging low-cost and idle resources globally can lead to cost savings by tapping into cheaper and underutilised computing and storage resources.
  • Resilience Against Censorship: A pivotal advantage of decentralised compute is its resilience against censorship, providing a counterbalance to the increasing concentration of AI development amongst a few large technology firms.
  • Resource Utilisation: Decentralised compute protocols are looking to utilise the unused computing power in our world, building open-source models through compute incentivisation.

The decentralised compute network market has seen substantial growth since 2023, coinciding with increased interest in AI and key AI-industry developments. This is reflected in the market cap of decentralised compute network tokens, which rose by 507% over the past year. Within this period, Render Network had the largest market cap, surpassing US$4.19 billion. Akash Network has seen the most significant growth, from US$487 million in February 2023 to US$1.3 billion today (roughly +167%).


2. Landscape of Decentralised Compute Networks

There are three emerging categories within decentralised compute networks today, namely decentralised compute, decentralised ML training, and Zero-Knowledge machine learning (ZKML) together with ZK coprocessors. They offer diverse solutions for the issues that AI is currently facing through the integration of blockchain technology.


2.1 Decentralised Compute

Decentralised compute networks provide unused computational resources such as GPUs and create open computational marketplaces. With a focus on developing and providing infrastructure, these markets unlock a significant amount of new GPU supply and computational capacity, enabling anyone in the world to become a resource provider.

Render Network and Akash Network are two of the most popular decentralised computing solutions today. They distinguish themselves in this category by specialising only in GPU services powered by a distributed network of GPUs.


Render Network was founded in 2016 by Jules Urbach, CEO of OTOY, a software company notable for cloud rendering services that deliver high-quality 3D graphics. The network consists of 5,600 nodes (GPU providers) and leverages over 50,000 GPUs globally for faster, comprehensive rendering services. It pools dormant GPU power and creates a marketplace where individuals and entities (‘Creators’) can leverage this unutilised resource to execute tasks like processing high-quality graphics, training AI models, and rendering frames, amongst others. 

In the past year, Render Network has seen some growth, led by increases in average compute work per job, which can be attributed to a small cluster of rendering jobs for the Las Vegas MSG Sphere and Vision Pro. In November 2023, it migrated its core infrastructure from Ethereum to Solana in preparation for future network growth. 


Akash Network launched its first mainnet in 2020, initially focusing on a permissionless cloud compute marketplace featuring storage and CPU leasing services. It diversified its offerings in 2023, first launching a new testnet offering GPUs, followed by a GPU mainnet (Akash ML) in September 2023 that enables users to lease high-performance GPUs for AI training and inferences. Akash ML is aimed at AI developers seeking GPU spot instances, with plans to offer on-demand access in the future.


Akash Network recorded new highs in 2023 for the number of new and total GPU leases on its network. Supply and demand for network resources are steadily increasing side-by-side, led by the training effort currently underway for Akash-Thumper — the first foundation AI model to be trained on a distributed network — as part of the retraining of a Stable Diffusion model.

[Chart: GPU Resources Leased. Source: Crypto.com]

Designed to power rendering work on the Render Network, RNDR tokens can be earned by Node Operators for work rendered on their nodes. Akash Network’s AKT token is used to secure the network and acts as a reserve currency in the Cosmos ecosystem.

Read more about these projects from our previous report DePIN: Crypto’s Rising Narrative.

2.2 Decentralised ML Training

Besides the generalised computing market, there are some networks specifically focused on the training process in machine learning (ML). 

AI is a broad field, referring to the development of computer programmes that can make decisions autonomously; ML is a subset of AI that typically refers to computer programmes that can learn how to make those decisions from the data they ingest, all on their own. In this context, ‘models’ are the programmes that make these decisions and ‘training’ describes the process of instilling knowledge within them. Bittensor and Gensyn are the two representative projects specifically focused on ML.

Powered by the Proof of Intelligence consensus mechanism, Bittensor is a peer-to-peer decentralised protocol facilitating machine learning collaboration and incentivising the production of machine intelligence. As part of its Revolution upgrade, Bittensor introduced subnets, which are specialised networks dedicated to a specific machine learning use case or resource provision.

By deploying a subnet, projects can earn a daily allocation of TAO (native token) emissions and hire a dedicated mining community to execute specific tasks assigned to the subnet. The registration cost doubles every time a project registers a subnet; if no one registers, the cost decays linearly, halving over four days. The cost to register a subnet on Bittensor has substantially increased over the past few months, driven by the explosion of AI projects registering: it has gone up from 168 TAO (US$101,000) in October 2023 to a peak of 10,281 TAO (US$6.18 million) in March 2024.
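The doubling-and-decay pricing dynamic described above can be modelled roughly as follows. This is a simplified sketch of the mechanism as described, not Bittensor’s actual implementation; the linear four-day halving window comes from the text, and everything else is illustrative:

```python
HALVING_WINDOW_DAYS = 4  # cost decays back to half over ~4 days (as described)

def cost_on_registration(current_cost):
    """Each new subnet registration doubles the prevailing cost."""
    return current_cost * 2

def cost_after(last_cost, days_since_registration):
    """Linear decay from last_cost toward last_cost / 2 over the window."""
    decay = min(days_since_registration / HALVING_WINDOW_DAYS, 1.0)
    return last_cost - (last_cost / 2) * decay

cost = 100.0                         # illustrative starting cost in TAO
cost = cost_on_registration(cost)    # a project registers: cost doubles to 200.0
assert cost_after(cost, 4) == 100.0  # four quiet days later: back to half
```

Repeated registrations therefore compound quickly, which is consistent with the sharp rise in registration cost noted above.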

Image8

One of the well-known challenges in building decentralised compute networks is the difficulty of the network verifying that the work has been correctly and honestly completed. Gensyn is a decentralised and incentivised market for ML compute. Designed to overcome the problem of verifying completed ML work, the platform leverages a Layer-1 trustless protocol for deep learning computation. Supply-side participants are rewarded for contributing their compute time to the network and performing ML tasks.

The protocol operates with the help of several participants, each playing a crucial role in the network:

  • Submitters: End users of the protocol who submit deep learning tasks to the network to be computed.
  • Solvers: Perform the compute work and generate receipts of completed work.
  • Verifiers: Ensure the work has been performed as requested through partial replication.
  • Whistleblowers: Serve as the final line of defence, checking the verifiers’ work and possibly challenging incorrectly performed work.

Initiating a compute task in Gensyn involves Submitters posting the details of the task on-chain, along with the pre-processed training data and the public location of the model binary. An estimate of the required work is then generated. After this, larger computational workloads can be split into sets of tasks and pushed to the network asynchronously. Completing the task involves subsequent stages, including profiling, training, proof generation, and verification of proof (via proof of learning).
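The interplay between Solvers, Verifiers, and Whistleblowers can be sketched as a toy flow. All names and checks here are hypothetical: the hash “receipt” stands in for Gensyn’s cryptographic proof-of-learning scheme, and real verification uses partial replication rather than the full recomputation shown here:

```python
import hashlib

def do_work(task):
    # Solver: performs the (toy) compute and returns a receipt of the work.
    output = sum(task["data"])
    receipt = hashlib.sha256(str(output).encode()).hexdigest()
    return output, receipt

def verify(task, receipt):
    # Verifier: replicates the work and checks the claimed receipt.
    # (Gensyn uses partial replication; full replication here is a simplification.)
    _, expected = do_work(task)
    return receipt == expected

task = {"data": [1, 2, 3]}            # Submitter posts the task
output, receipt = do_work(task)       # Solver computes and issues a receipt
assert verify(task, receipt)          # Verifier accepts honest work
assert not verify(task, "bogus")      # a bad receipt can be challenged (Whistleblower)
```

The design intent is that dishonest Solvers are caught by Verifiers, and lazy or colluding Verifiers are in turn caught by Whistleblowers.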

2.3 ZKML and ZK Coprocessors

2.3.1 ZKML

ZKML is a combination of ML and AI techniques with Zero-Knowledge (ZK) proofs. It enables the verification of complex ML models and algorithms without exposing their details or the underlying training data. Below are the ways ZK proofs can be applied in ML:

  • Model Authenticity: ZK proofs can verify that the ML model being used is the one claimed by the entity, ensuring transparency and preventing fraud.
  • Model Integrity: ZK proofs can ensure that the same ML algorithm is applied consistently to different users’ data, avoiding arbitrary bias and maintaining fairness.
  • Attestations: ZK proofs can integrate attestations from external verified parties into models or smart contracts, verifying the authenticity and provenance of information or data.
  • Decentralised Inference or Training: ZK proofs enable ML inference or training to be performed in a decentralised manner, ensuring privacy and trustlessness.
  • Privacy-Preserving Computation: ZK proofs can protect the privacy of the prover and the data being processed, allowing computations over sensitive data without revealing anything to the verifier.

Today, ZKML solutions focus on verifying inference rather than the training phase (primarily due to the computational complexity of verifying training in-circuit), bringing ZK proofs to the inference stage of the ML model. Various types of ZK systems have been developed, though in most cases, the ‘ZK’ in ZKML refers to ZK-SNARKs.
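As a much weaker but illustrative stand-in for the model-authenticity use case, a plain hash commitment shows the idea of proving that a published model was actually used. Unlike a real ZK-SNARK, this construction has no zero-knowledge property (the verifier must see the weights), so it is only a conceptual sketch:

```python
import hashlib
import json

def commit(model_weights):
    # Publisher commits to a model by hashing a canonical encoding of its weights.
    blob = json.dumps(model_weights, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

def check_authenticity(model_weights, published_commitment):
    # A verifier checks that the served model matches the published commitment.
    # NOTE: not zero-knowledge -- the verifier sees the weights. ZKML replaces
    # this check with a SNARK so the model can stay private.
    return commit(model_weights) == published_commitment

weights = {"layer1": [0.1, -0.2], "bias": [0.05]}   # toy model weights
commitment = commit(weights)
assert check_authenticity(weights, commitment)
assert not check_authenticity({"layer1": [0.0], "bias": []}, commitment)
```

ZKML’s contribution is to get the same binding guarantee while revealing neither the weights nor the training data to the verifier.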


Potential use cases at Worldcoin

ZKML finds a potential application in Worldcoin by allowing World ID users to securely store their biometrics on their mobile devices, generate iris codes using a specific ML model, and create a ZK proof locally to validate the iris code’s creation. 

This process enables users to upgrade their iris codes without the need to revisit an Orb (a device used for iris scanning to verify World ID, a unique human identity within the Worldcoin ecosystem), ensuring compatibility with any algorithm changes made by Worldcoin. The ZK proof validates the iris code creation, allowing users to seamlessly update their biometric data within the Worldcoin system without compromising security or privacy.

2.3.2 ZK Coprocessors

ZK coprocessors are a new type of blockchain infrastructure that allows smart contract developers to perform off-chain computations over existing on-chain data using ZK proofs. This enables applications to access more data and operate at a greater scale than was previously possible. 

ZK coprocessors are similar to hardware coprocessors, which are separate chips designed to augment a CPU and provide specialised operations. In the context of blockchain computers, ZK coprocessors enhance Ethereum by allowing smart contracts to trustlessly delegate historic on-chain data access and computation over it using ZK proofs. They introduce a new design pattern for on-chain applications, allowing developers to delegate costly operations from the blockchain virtual machine (i.e., EVM) to a ZK coprocessor, thereby decoupling data access and compute from blockchain consensus.
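The delegation pattern can be sketched abstractly: heavy computation runs off-chain, and the on-chain side only checks a succinct proof instead of recomputing. The hash-based “proof” below is a trivial stand-in for a real ZK proving system; in a genuine ZK coprocessor, verification would not need to touch the underlying data at all:

```python
import hashlib

def offchain_compute(historic_values):
    # Off-chain coprocessor: runs an expensive query over historic data
    # and produces a (toy) proof binding the inputs to the result.
    result = sum(historic_values)
    proof = hashlib.sha256(f"{historic_values}:{result}".encode()).hexdigest()
    return result, proof

def onchain_verify(historic_values, result, proof):
    # On-chain side: cheap verification only. A real ZK verifier would check
    # a succinct proof without recomputing over the data as done here.
    expected = hashlib.sha256(f"{historic_values}:{result}".encode()).hexdigest()
    return proof == expected

values = [10, 20, 30]                       # stand-in for historic on-chain data
result, proof = offchain_compute(values)
assert onchain_verify(values, result, proof)
```

The design payoff is that the contract’s gas cost is bounded by proof verification, not by the size of the historical computation being delegated.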

Axiom is a platform built on Ethereum that enables developers to write smart contracts with trustless access to historical on-chain data and verified computation through ZK proofs. It addresses a limitation of smart contracts on Ethereum, which can only access the current state and not historical data. Axiom allows smart contract developers to perform ZK-verified computations over the entire history of Ethereum: it imports on-chain data, verifies it via ZK-SNARKs, and allows for ML operations over it. Axiom provides ZK proofs along with query results, allowing users to authenticate the accuracy of online information and tackle issues like deepfakes.

3. Conclusion

The intersection of AI and blockchain is arguably one of the most interesting narratives driving the current cycle. This sentiment is evident from the increased adoption and activity in these decentralised compute networks, as well as ongoing exploration and investments in this particular segment. AI as a sector is accelerating rapidly, and with the integration of blockchain, we can expect more progress soon. 

Decentralised compute networks present a viable solution to help address the scaling, resourcing, and cost issues surrounding the AI sector. By harnessing the power of distributed networks, they empower developers to unlock additional GPUs, access more cost-efficient alternatives, and tap new computational resources, which they can use to build their own AI infrastructure and deployments.

The outlook for AI development is positive, based on the significant use cases found within decentralised AI compute and GPU networks, decentralised ML training, and the ZKML and coprocessor categories. As AI development becomes more decentralised and moves on-chain, we can expect more verticals and key players to emerge in the market.

The infrastructure, development, and approaches to AI will become increasingly decentralised, gradually evolving into a category where the Web2 and Web3 worlds converge with notable innovations. Decentralising AI democratises it, positioning it to unlock its full potential and pave the way for a future where strategic resources are fully capitalised on and artificial super-intelligence can become a reality.

Read the full report: Decentralised Compute for AI Development


Authors

Crypto.com Research and Insights team

