AI x Crypto - Promises and Realities

Hack VC

20 Jun 2024

By Ed Roman, Managing Partner at Hack VC

AI has recently become one of the hottest and most promising categories in crypto markets.

💡Decentralized AI training

💡GPU DePINs

💡Uncensored AI models

Are these breakthroughs or just buzzwords? 🤔

At Hack VC, we're cutting through the noise to separate promise from reality.

This post dissects the top crypto x AI ideas. Let's discuss the real challenges and opportunities.

Ideas with initial promise, but which have encountered challenges in reality.

First, let’s start with the “promise of web3 AI”—ideas that have quite a bit of hype, but for which reality may not be as shiny.

Idea #1: Decentralized AI training

The problem with AI training on-chain is that training requires high-speed communication and coordination among GPUs, because neural networks rely on backpropagation during training. Nvidia offers two technologies for this: NVLink and InfiniBand. They make GPU-to-GPU communication extremely fast (50+ gigabit speeds), but they’re local-only technologies that apply only to GPU clusters located within a single datacenter.

If you introduce a decentralized network into the picture, you’re suddenly orders of magnitude slower due to the added network latency and reduced bandwidth. That’s a non-starter for AI training use cases compared to the throughput you get from Nvidia’s high-speed interconnectivity within a datacenter.
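To make the gap concrete, here is a back-of-the-envelope sketch in Python. The 50 Gbps figure comes from the text above; the model size, gradient precision, and home uplink speed are illustrative assumptions, not measurements. The estimate also ignores latency, gradient compression, and any overlap of compute with communication.

```python
# Rough time to move one full copy of a model's gradients over a link.
# PARAMS, BYTES_PER_PARAM, and the home uplink speed are assumptions.

def sync_seconds(num_params: float, bytes_per_param: int, link_gbps: float) -> float:
    """Seconds to transfer one copy of the gradients at the given link speed."""
    bits = num_params * bytes_per_param * 8
    return bits / (link_gbps * 1e9)

PARAMS = 7e9          # assumed 7B-parameter model
BYTES_PER_PARAM = 2   # assumed 16-bit gradients

# 50 Gbps is the datacenter figure quoted in the text.
datacenter = sync_seconds(PARAMS, BYTES_PER_PARAM, link_gbps=50)
# 0.1 Gbps (100 Mbps) is an assumed consumer uplink.
home_link = sync_seconds(PARAMS, BYTES_PER_PARAM, link_gbps=0.1)

print(f"datacenter link: {datacenter:.2f} s per sync")
print(f"home uplink:     {home_link:.0f} s per sync")
print(f"slowdown:        {home_link / datacenter:.0f}x")
```

Under these assumptions, a single gradient exchange takes hundreds of times longer over the home uplink, and training repeats that exchange at every step.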

Note that there have been some innovations here that may provide some hope for the future:

Distributed training over InfiniBand is happening at considerable scale, as NVIDIA itself supports distributed, non-local training over InfiniBand via the NVIDIA Collective Communications Library (NCCL). It’s still nascent, however, so adoption metrics are TBD. See here. The laws of physics still impose a bottleneck over distance, so local training over InfiniBand remains significantly faster.
Some novel research has been published on decentralized training that relies on fewer communication syncs, which could make decentralized training more practical in the future. See here and here.
Intelligent sharding and scheduling of model training can help improve performance. Similarly, new model architectures may be uniquely designed for distributed infrastructure in the future (Gensyn is researching in these areas).

The data component of training is also challenging. Any AI training process involves working with vast amounts of data. Typically, models are trained on centralized and secure data storage systems with high scalability and performance. This requires transferring and processing terabytes of data, and this is not a one-time cycle. Data is usually noisy and contains errors, so it must be cleaned and transformed into a usable format before training the model. This stage involves repetitive tasks of normalization, filtering, and handling missing values. This all poses serious challenges in a decentralized environment.

The data component of training is also iterative, which doesn’t lend itself well to web3. It took OpenAI thousands of iterations to achieve its results. The most basic task scenario for a data science specialist on an AI team includes defining objectives, preparing data, and analyzing and structuring data to extract important insights and make it suitable for modeling. Then, a machine learning model is developed to solve the defined problem, and its performance is validated using a test dataset. This process is iterative: if the current model does not perform as expected, the specialist returns to the data collection or model training stage to improve the results. Now imagine this process in a decentralized setting, where best-of-breed frameworks and tools are not readily available.

The other concern with training AI models on-chain is that it’s a much less interesting market compared to inference. For now, there’s an enormous amount of GPU compute being used for AI LLM training. But in the long run, inference will become (by far) the more prevalent use case of GPUs. Consider: how many AI LLMs need to be trained for the world to be happy, compared to the number of customers who will use those models?

One solution making progress across all fronts is 0g.ai (backed by Hack VC), which provides both on-chain data storage and data availability infrastructure. Its ultra-fast architecture and ability to store huge amounts of data on-chain allow for fast, iterative on-chain AI model training of any type.

Idea #2: Using overly-redundant computation of AI inference for consensus

One of the challenges with crypto x AI is verifying the accuracy of AI inference, since you can’t necessarily trust a single centralized party to perform that inference due to the potential for misbehaving nodes. This challenge doesn’t exist in web2 AI because there isn’t a decentralized, consensus-style system.

One proposed idea to solve this is redundant computation, where multiple nodes repeat the same AI inference operation, so that you can operate in a trustless manner and not have a single point of failure.

The problem with this approach is that we live in a world with a drastic shortage of high-end AI chips. There’s a multi-year wait period for high-end NVIDIA chips, and that results in price hikes. If you were to (additionally) require that your AI inference be re-executed multiple times on multiple nodes, you’re now multiplying those expensive costs. This is going to be a non-starter for many projects.
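As a minimal sketch of what redundant computation involves, the Python below compares outputs from several nodes and accepts the answer reported by a strict majority. It assumes inference is deterministic, so that honest nodes return byte-identical outputs; the function names and the cost figure are hypothetical and do not describe any specific protocol.

```python
import hashlib
from collections import Counter
from typing import List, Optional

def digest(output: str) -> str:
    """Hash an inference output so results can be compared cheaply."""
    return hashlib.sha256(output.encode("utf-8")).hexdigest()

def majority_result(outputs: List[str]) -> Optional[str]:
    """Return the output reported by a strict majority of nodes, else None."""
    if not outputs:
        return None
    counts = Counter(digest(o) for o in outputs)
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(outputs):
        return None  # no strict majority, so the result cannot be trusted
    return next(o for o in outputs if digest(o) == winner)

def redundant_cost(cost_per_inference: float, replicas: int) -> float:
    """Every replica re-runs the full inference, so cost scales linearly."""
    return cost_per_inference * replicas

# Three nodes, one of which misbehaves.
print(majority_result(["answer A", "answer A", "answer B"]))
# Hypothetical price of 0.5 units per inference, replicated 3 times.
print(redundant_cost(0.5, replicas=3))
```

The vote itself is cheap; the expense is that each replica pays the full GPU cost of the inference.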

Idea #3: Web3-specific AI use cases in the near-term

There have been suggestions that web3 should have its own, unique AI use cases that are specific to web3 customers. This can be (for example) a web3 protocol that uses AI to perform risk scoring of a DeFi pool, a web3 wallet suggesting new protocols for you based on your wallet history, or a web3 game that uses AI to control non-player characters (NPCs).

For now, this is a nascent market where use cases are still being discovered. Some challenges include:

Fewer potential AI transactions are needed for web3-native use cases, since the market demand is still in its infancy.
Fewer customers, as there are orders of magnitude fewer web3 customers than web2 customers, so the market is smaller and less fragmented.
Customers themselves are less stable since they’re startups with less funding, so some of those startups may die off over time. A web3 AI service provider who caters to web3 customers will likely need to re-acquire a portion of their customer base over time to replace the ones that die off, making it a more challenging business to scale.

Longer term, we’re quite bullish on web3-native AI use-cases, especially as AI agents become more prevalent. We imagine a future where any given web3 user has a multitude of AI agents assisting them. The early category leader for this is Theoriq (backed by Hack VC), which enables composable and autonomous on-chain AI agents.

Idea #4: Consumer-grade GPU DePINs

There are a number of decentralized AI compute networks that rely on consumer-grade GPUs rather than data centers. Consumer GPUs are handy for low-end AI inference tasks or for consumer use cases where latency, throughput, and reliability are flexible. But for serious enterprise use cases (which is the majority of the market that matters), customers want a more reliable network than people’s home machines, and often need higher-end GPUs if they have more complex inference tasks. Data centers are more appropriate for these more valuable customer use cases.

Note that we view consumer-grade GPUs as useful for demo purposes or for individuals and startups who can tolerate lower reliability. But those customers are fundamentally less valuable, so we believe DePINs that cater to web2 enterprises will be more valuable longer term. As such, the well-known GPU DePIN projects have generally evolved from the early days of mostly consumer-grade hardware into having A100/H100 and cluster-level availability.

Reality—the practical and realistic use cases of crypto x AI

Now, let’s discuss the use cases that provide “real benefits.” These are actual “wins” where crypto x AI can add significant value.

Real Benefit #1: Serving web2 customers

McKinsey estimates that generative AI could add the equivalent of $2.6 trillion to $4.4 trillion annually across the 63 use cases they analyzed—by comparison, the United Kingdom’s entire GDP in 2021 was $3.1 trillion. This would increase the impact of all artificial intelligence by 15% to 40%. This estimate would roughly double if we include the impact of embedding generative AI into software that is currently used for other tasks beyond those use cases.

If you do the math on the above estimate, it implies that the total market for AI (beyond generative AI) could be worth in the tens of trillions of dollars, worldwide. By way of comparison, all of the cryptocurrencies combined, including Bitcoin and every alt coin, are only worth around $2.7 trillion today. So let’s be real here: the vast majority of the customers who need AI in the short term are going to be web2 customers, since the web3 customers who actually need AI will be a tiny slice of this $2.7 trillion (consider that BTC is half of this market, and BTC itself doesn’t need/use AI).
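The arithmetic behind "tens of trillions" can be sketched as follows. The inputs are the figures quoted above; pairing the low value with the low uplift (and the high value with the high uplift) is an assumed reading of the estimate, not something the source spells out.

```python
# Figures quoted above, in trillions of US dollars per year.
GEN_AI_LOW, GEN_AI_HIGH = 2.6, 4.4    # generative AI value range
UPLIFT_LOW, UPLIFT_HIGH = 0.15, 0.40  # increase over all other AI

# Assumption: low value pairs with low uplift, high value with high uplift.
other_ai_a = GEN_AI_LOW / UPLIFT_LOW    # implied value of non-generative AI
other_ai_b = GEN_AI_HIGH / UPLIFT_HIGH

total_a = other_ai_a + GEN_AI_LOW       # implied value of all AI
total_b = other_ai_b + GEN_AI_HIGH

low_total, high_total = sorted((total_a, total_b))
print(f"Implied total AI value: ${low_total:.1f}T to ${high_total:.1f}T per year")
```

Under that reading, the implied total lands at roughly $15 trillion to $20 trillion per year.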

Web3 AI use cases are just getting started, and it’s not at all clear what the size of that market will be. But one thing is intuitively certain: it will be a small fraction of the web2 market for the foreseeable future. We believe web3 AI still has a bright future; it simply means that the most powerful application of web3 AI is, for now, serving web2 customers.

Examples of web2 customers who could hypothetically benefit from web3 AI include:

Vertical-specific software companies that are built from the ground up to be AI-centric (e.g., Cedar.ai or Observe.ai)
Large enterprises that are fine-tuning models for their own purposes (e.g., Netflix)
Fast-growing AI providers (e.g., Anthropic)
Software companies that are sprinkling AI into their existing products (e.g., Canva)

This is a relatively stable customer persona, since the customers are generally large and valuable. They aren’t likely to go out of business anytime soon, and they represent very large potential customers for AI services. Web3 AI services that serve web2 customers will benefit from this stable customer base.

But why would a web2 customer want to use a web3 stack? The remainder of this post makes that case.

Real Benefit #2: Driving down the costs of GPU usage via GPU DePINs

GPU DePINs aggregate under-utilized GPU compute power (the most reliable of which come from data centers) and make them available for AI inference (an example of this is io.net, which is a portfolio company of funds managed by Hack VC). A simple way to think about this is “Airbnb for GPUs” (effectively, collaborative consumption of under-utilized assets).

The reason we’re excited about GPU DePINs is that, as noted above, there’s a shortage of NVIDIA chips, and there are currently wasted GPU cycles that can be used for AI inference. These hardware owners have a sunk cost and are not making full use of their equipment today, and can therefore offer those fractional GPU cycles at a much lower cost compared to the status quo, since it’s effectively “found money” for the hardware owners.

Examples include:

AWS machines. If you were to rent an H100 from AWS today, you’d have to commit to a 1-year lease, because the market is supply constrained. This produces waste, as you likely aren’t going to use your GPU 24 hours per day, 365 days per year.
Filecoin mining hardware. Filecoin is a network with a large amount of subsidized supply but not a significant amount of real demand. Filecoin unfortunately never found true product-market fit, and so Filecoin miners are in danger of going out of business. Those machines have GPUs on them and can be repurposed for lower end AI inference tasks.
ETH mining hardware. When ETH transitioned from PoW to PoS, that immediately made a large amount of hardware available that could be repurposed for AI inference.
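The waste from committed but idle hardware can be put in numbers with a small sketch. The lease price and utilization below are hypothetical placeholders, not quotes from AWS or any other provider.

```python
# Effective cost of a committed GPU lease per hour of actual use.
# LEASE_PRICE and UTILIZATION are hypothetical placeholders.

def cost_per_useful_hour(committed_hourly_price: float, utilization: float) -> float:
    """Price paid per hour in which the GPU actually does work."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return committed_hourly_price / utilization

def idle_hours_per_year(utilization: float) -> float:
    """Hours per year the leased GPU sits unused."""
    return (1 - utilization) * 365 * 24

LEASE_PRICE = 4.0    # hypothetical price per committed GPU-hour
UTILIZATION = 0.25   # hypothetical share of hours the GPU is busy

print(cost_per_useful_hour(LEASE_PRICE, UTILIZATION))
print(idle_hours_per_year(UTILIZATION))
```

Any price a DePIN pays for those idle hours is incremental revenue for the owner, which is why the text describes it as "found money."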

Note that not all GPU hardware is appropriate for AI inference. One glaring reason for this is that older GPUs don’t have the necessary amount of GPU memory for LLMs, although there have been some interesting innovations to help here. Exabits, for example, has tech that loads active neurons into GPU memory and inactive neurons into CPU memory. They predict which neurons need to be active / inactive. This allows lower end GPUs to process AI workloads, even with limited GPU memory. This effectively makes lower end GPUs more useful for AI inference.

Note also that web3 AI DePINs will need to harden their offerings over time and provide enterprise-class services, such as single sign-on, SOC 2 compliance, service-level agreements (SLAs), and more. This would mirror the cloud offerings that web2 customers currently enjoy.

Real Benefit #3: Uncensored models to avoid OpenAI self-censorship

There’s been much talk of AI censorship. Turkey, for example, temporarily banned OpenAI (they later reversed course on that once OpenAI improved its compliance). We believe this sort of country-level censorship is fundamentally uninteresting since countries will need to embrace AI to remain competitive.

What’s more interesting is that OpenAI censors itself. For example, OpenAI will not handle NSFW content. Nor will OpenAI predict the next presidential election. We think there’s an interesting and large market for use cases of AI that OpenAI will not touch for political reasons.

Open sourcing is a great solution for this, since a GitHub repo isn’t beholden to shareholders or a board. An example of this is Venice.ai, which promises to preserve your privacy as well as operate in an uncensored manner. The key, of course, is being open source, which is what makes this possible. What web3 AI can add is the ability to run these open-source software (OSS) models on lower-cost GPU clusters for inference. It’s for these reasons we believe OSS + web3 are the ideal combination to pave the way for uncensored AI.

Real Benefit #4: Avoiding sending personally identifiable information to OpenAI

Many large enterprises have privacy concerns about their internal enterprise data. For these customers, it can be exceedingly difficult to trust a centralized third party, like OpenAI, with that data.

With web3, it may appear (on the surface) even more scary for these enterprises, since their internal data is suddenly on a decentralized network. There are, however, some innovations in privacy enhancing technologies for AI:

Trusted Execution Environments (TEE) such as the Super protocol
Fully Homomorphic Encryption (FHE) such as Fhenix.io (a portfolio company of a fund managed by Hack VC) or Inco Network (each of which is powered by Zama.ai), and Bagel’s PPML

These technologies are still evolving, and the performance is still improving via upcoming zero knowledge (ZK) and FHE ASICs. But the long-term goal is to protect that enterprise data when fine tuning a model. As these protocols emerge, web3 may become a more attractive venue for privacy-preserving AI compute.

Real Benefit #5: Take advantage of the latest innovations in open-source models

OSS has consistently eroded the market share of proprietary software for the past few decades. We look at an LLM as simply a fancy form of proprietary software that’s ripe for OSS disruption. A few noteworthy examples of challengers include Llama, RWKV, and Mistral.ai. This list will undoubtedly grow as time progresses (a more comprehensive list is available at Openrouter.ai). By harnessing web3 AI (powered by OSS models) one can take advantage of these new innovations.

We believe that as time progresses, an open-source global development work force, combined with crypto incentives, can drive fast innovation in open-source models, as well as the agents and frameworks built on top of them. An example of an AI agent protocol is Theoriq. Theoriq harnesses OSS models to create a composable interconnected web of AI agents that can be assembled to create higher-level AI solutions.

Our conviction here comes from history: most “developer software” has slowly been out-innovated by OSS over time. Microsoft used to be a proprietary software company, and now they’re the #1 company contributing to GitHub, and there’s a reason for that. If you look at how Databricks, PostgreSQL, MongoDB, and others have disrupted proprietary databases, that’s an example of an entire industry that was upended by OSS, so the precedent here is quite strong.

This does, however, come with a catch. One of the tricky things with OSS LLMs is that OpenAI has started to create paid data licensing agreements with organizations, such as Reddit and the New York Times. If this trend continues, it may become more difficult for OSS LLMs to compete due to the financial barrier involved with them acquiring data. It may be possible that Nvidia doubles down on confidential computing as a secure data-sharing enabler. Time will tell how this pans out.

Real Benefit #6: Consensus achieved via random sampling with high slashing costs, or via ZK proofs

One of the challenges with web3 AI inference is verification. There’s a hypothetical opportunity for validators to cheat on their results to earn fees, so verifying inferences is an important measure. Note that this cheating hasn’t actually happened yet, because AI inference is in its infancy, but it’s inevitable unless measures are taken to disincentivize that behavior.

The standard web3 approach is to have multiple validators repeat the same operation and compare the results. The glaring challenge with this is that, as noted, AI inference is expensive due to the current shortage of high-end Nvidia chips. Given that web3 can offer lower-cost inference via under-utilized GPU DePINs, redundant computation would severely undercut the web3 value proposition.

A more promising solution is to perform a ZK proof for off-chain AI inference computation. In this case, the succinct ZK proof can be verified to determine that a model was trained properly, or that inference was run properly (known as zkML). Examples include Modulus Labs and ZKonduit. The performance of these solutions is still immature, since ZK operations are quite compute intensive. However, we anticipate this will likely improve as ZK hardware ASICs are released in the near future.

Even more promising is the idea of a somewhat “optimistic” sampling-based AI inference approach. In this model, you would verify just a tiny percentage of the results generated by the validators, but set the slashing economic cost high enough such that, if caught, it would create a strong economic disincentive for validators to cheat. In this way, you’re saving on redundant compute (e.g., see Hyperbolic’s Proof of Sampling paper).
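The economics of sampling plus slashing can be sketched with a simple expected-value model. This is an illustrative simplification, not the mechanism from Hyperbolic's paper: it assumes a cheating validator saves a fixed amount per inference, is audited independently with a fixed probability, and loses a fixed slash when caught. All names and numbers are hypothetical.

```python
def expected_cheating_profit(gain: float, audit_prob: float, slash: float) -> float:
    """Expected profit per cheated inference: keep the gain if not audited,
    lose the slashed stake if audited."""
    return (1 - audit_prob) * gain - audit_prob * slash

def min_slash(gain: float, audit_prob: float) -> float:
    """Smallest slash at which cheating stops being profitable in expectation."""
    if not 0 < audit_prob <= 1:
        raise ValueError("audit_prob must be in (0, 1]")
    return gain * (1 - audit_prob) / audit_prob

GAIN = 0.01        # hypothetical compute cost saved by faking one result
AUDIT_PROB = 0.02  # hypothetical: verify 2% of results

print(min_slash(GAIN, AUDIT_PROB))
print(expected_cheating_profit(GAIN, AUDIT_PROB, slash=1.0))
```

In this model, re-running 2% of inferences adds about 2% in verification compute, compared with an extra 100% or more when every inference is fully replicated. The cost of deterrence shifts from compute to staked capital.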

Another promising idea is a watermarking and fingerprinting solution, such as one proposed by the Bagel Network. This is similar to Amazon Alexa’s mechanism for quality assurance of on-device AI models for their millions of devices.

Real Benefit #7: Saving on fees (OpenAI’s margin) via OSS

The next opportunity web3 brings to AI is to democratize costs. So far, we’ve talked about saving GPU costs via DePINs. But web3 also offers opportunities to save on the profit margins of centralized web2 AI services (e.g., OpenAI, which is doing over $1B/year in revenue as of the time of this writing). These savings come from using OSS models rather than proprietary ones: the creator of an OSS model isn’t attempting to turn a profit, which adds another layer of savings.

Many OSS models will remain completely free, which enables the best possible economics for customers. But some OSS models may attempt to monetize as well. Consider that only 4% of the total models on Hugging Face are trained by companies with budgets to help subsidize the models (see here). The remaining 96% of models are trained by the community. This cohort, 96% of Hugging Face, has real costs (including compute costs and data costs). So those models will somehow need to monetize.

There are a number of proposals for monetizing OSS models. One of the most interesting is the concept of an “Initial Model Offering” (IMO), where you tokenize the model itself, hold back a percentage of tokens for the team, and flow some future revenues from that model to the token holders, although there are clearly some legal and regulatory hurdles there.
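The revenue flow in such an offering could look like the pro-rata split below. This is a hypothetical sketch: the holder names, token balances, and the share of revenue routed to holders are made up for illustration and do not describe any existing offering.

```python
from typing import Dict

def distribute_revenue(
    revenue: float,
    share_to_holders: float,
    balances: Dict[str, float],
) -> Dict[str, float]:
    """Split a fixed share of model revenue among holders, pro rata by tokens."""
    total_tokens = sum(balances.values())
    if total_tokens <= 0:
        raise ValueError("no tokens outstanding")
    pool = revenue * share_to_holders
    return {holder: pool * tokens / total_tokens for holder, tokens in balances.items()}

# Hypothetical cap table: the team holds back 20% of the tokens.
balances = {"team": 200.0, "alice": 500.0, "bob": 300.0}

# Hypothetical period: 1000 units of revenue, half of it routed to holders.
payouts = distribute_revenue(1000.0, 0.5, balances)
print(payouts)
```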

Other OSS models will attempt to monetize on usage. Note that if this comes to fruition, the OSS models may begin to increasingly resemble their web2 profit-generating counterparts. But, realistically, the market will be bifurcated, with some models remaining completely free.

Real Benefit #8: Decentralized data sourcing

One of the largest challenges with AI is sourcing the right data to train your models. We mentioned earlier that decentralized AI training has its challenges. But what about using a decentralized network to source data (which can then be used for training elsewhere, even in traditional web2 venues)?

This is exactly what startups such as Grass are doing. Grass is a decentralized network of “data scrapers,” individuals who contribute their machine’s idle processing power towards sourcing data to inform training of AI models. Hypothetically, at scale, this data sourcing can be superior to any one company’s internal efforts to source data due to the sheer power of a large network of incentivized nodes. This includes not just sourcing more data, but sourcing that data more frequently so that the data is more relevant and up-to-date. It’s also virtually impossible to stop a decentralized army of data scrapers, since they’re inherently fragmented and don’t reside within a single IP address. Grass also has a network of humans who can clean and normalize the data, so that it’s useful after being scraped.

Once you have the data, you also need a place on-chain to store it, along with the LLMs trained on that data. 0g.ai is the early leader in this category. It’s an AI-optimized, high-performance web3 storage solution that is significantly cheaper than AWS (another economic win for web3 AI), while also serving as data availability infrastructure for Layer 2s, AI, and more.

Note that the role of data in web3 AI may change in the future. Today, the status quo for LLMs is to pre-train a model with data and to refine it over time with more data. However, those models are always slightly out of date, since the data on the Internet changes in real time. So the responses from LLM inference are slightly inaccurate.

The world may be headed toward a new paradigm: "realtime" data. The concept is that when an LLM is asked a question at inference time, data gathered in real time from the Internet is injected into its prompt. That way, the LLM uses the most up-to-date data possible. Grass is researching this as well.
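A minimal sketch of that idea, assuming the fresh data has already been gathered, might look like the following. The `Snippet` type, the `build_prompt` helper, and the prompt wording are hypothetical; the scraping itself and the call to the LLM are left out.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Snippet:
    source: str      # where the data was gathered from
    fetched_at: str  # ISO 8601 timestamp of when it was gathered
    text: str

def build_prompt(question: str, snippets: List[Snippet], max_chars: int = 2000) -> str:
    """Prepend freshly gathered snippets to the question, within a size budget."""
    context_lines = []
    used = 0
    for s in snippets:
        line = f"[{s.source} @ {s.fetched_at}] {s.text}"
        if used + len(line) > max_chars:
            break  # stop once the context budget is exhausted
        context_lines.append(line)
        used += len(line)
    context = "\n".join(context_lines)
    return (
        "Use the context below, gathered just now, to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical snippet standing in for data scraped moments ago.
fresh = [Snippet("example.com", "2024-06-20T00:00:00Z", "fresh fact")]
print(build_prompt("What changed today?", fresh))
```

The model's weights stay unchanged; only the prompt carries the up-to-date information.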

Conclusion

We hope this serves as a useful analysis for you when thinking about promises vs realities of web3 AI. This is just a starting point for conversation, and the landscape is quickly changing, so please feel free to chime in and express your views too, as we’d love to keep learning and building together.

Acknowledgements

A very special thanks to Albert Castellana, Jasper Zhang, Vassilis Tziokas, Bidhan Roy, Rezo, Vincent Weisser, Shashank Yadav, Ali Husain, Nukri Basharuli, Emad Mostaque, David Minarsch, Tommy Shaughnessy, Michael Heinrich, Keccak Wong, Marc Weinstein, Phillip Bonello, Jeff Amico, Ejaaz Ahamadeen, Evan Feng, and JW Wang for their feedback and contributions to this post.

The information herein is for general information purposes only and does not, and is not intended to, constitute investment advice and should not be used in the evaluation of any investment decision. Such information should not be relied upon for accounting, legal, tax, business, investment, or other relevant advice. You should consult your own advisers, including your own counsel, for accounting, legal, tax, business, investment, or other relevant advice, including with respect to anything discussed herein.

This post reflects the current opinions of the author(s) and is not made on behalf of Hack VC or its affiliates, including any funds managed by Hack VC, and does not necessarily reflect the opinions of Hack VC, its affiliates, including its general partner affiliates, or any other individuals associated with Hack VC. Certain information contained herein has been obtained from published sources and/or prepared by third parties and in certain cases has not been updated through the date hereof. While such sources are believed to be reliable, neither Hack VC, its affiliates, including its general partner affiliates, or any other individuals associated with Hack VC are making representations as to their accuracy or completeness, and they should not be relied on as such or be the basis for an accounting, legal, tax, business, investment, or other decision. The information herein does not purport to be complete and is subject to change and Hack VC does not have any obligation to update such information or make any notification if such information becomes inaccurate.

Past performance is not necessarily indicative of future results. Any forward-looking statements made herein are based on certain assumptions and analyses made by the author in light of his experience and perception of historical trends, current conditions, and expected future developments, as well as other factors he believes are appropriate under the circumstances. Such statements are not guarantees of future performance and are subject to certain risks, uncertainties, and assumptions that are difficult to predict.
