NVIDIA and the New AI Paradigm!

NVIDIA is no longer a graphics computing company; it's an AI inference machine...

How did we get here? Let's proceed in order.

A quick caveat before moving ahead: nothing in this newsletter should be deemed financial advice! The analyses I send you are meant to give you some understanding of the long-term strategy of the companies I analyze, outside short-term stock price fluctuations, which I neither know nor care about. My hope is that, if you're an executive, you can learn from these analyses to make better decisions and stay ahead of the commercial curve.

In the latest earnings release, Jensen Huang, CEO and co-founder of NVIDIA, highlighted three key paradigm shifts, which are unfolding right now and fundamentally shaking the whole software industry from within:

  • Paradigm Shift 1: The move from general-purpose to accelerated computing, which dramatically improves energy efficiency and cost (by roughly 20x) and delivers a step-change improvement in speed.

  • Paradigm Shift 2: Generative AI as a fundamentally new way of building software and a new way of computing, redefining the whole cloud industry (from retrieval to inference).

  • Paradigm Shift 3: A whole new industry, from hardware to software. For the first time, a data center is not just about computing and storing data; there is a new type of data center, dedicated to AI generation.

In the last couple of years, I've been explaining this in what I defined as a new "AI Business Ecosystem."

Let's tackle these three waves, starting with the basics before moving to the advanced stuff...

Paradigm Shift 1: The GPU isn't just a new chip

In the last few years, I've been explaining, over here, how the hardware part has become a critical component of the success of prominent tech players like Google, Apple, Meta, and Amazon.

I explained it in detail in the AI Supply Chain piece, which I wrote back in 2020 and updated last year.

Both GPUs and TPUs are critical components of an AI supercomputer.

The GPU, or graphics processing unit, is a powerful chip that performs massively parallel computation. Initially conceived to accelerate 3D graphics rendering in video games, GPUs turned out to be the perfect architecture for the current AI paradigm and have become central to artificial intelligence and machine learning (ML).
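To make "parallelized computing" concrete, here is a minimal sketch of the operation that dominates AI workloads, a large matrix multiplication, which a GPU spreads across thousands of cores at once. This is an illustrative example, assuming the PyTorch library; it falls back to the CPU if no CUDA GPU is present.

```python
# A minimal sketch of why GPUs fit modern AI: the core workload is large
# matrix multiplication, which parallelizes across thousands of GPU cores.
# Assumes PyTorch; falls back to CPU if no CUDA GPU is available.
import time
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)

start = time.perf_counter()
c = a @ b  # one layer of a neural network is, at its core, this operation
if device == "cuda":
    torch.cuda.synchronize()  # wait for the asynchronous GPU kernel to finish
elapsed = time.perf_counter() - start

print(f"4096x4096 matrix multiply on {device}: {elapsed * 1000:.1f} ms")
```

On a modern GPU, this typically runs an order of magnitude or more faster than on a CPU, which is precisely the gap that made GPUs the workhorse of deep learning.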

GPUs are critical components of AI Supercomputers, like the one Microsoft built on Azure, which are powering up the current AI revolution.

A close relative of the GPU is the TPU, or tensor processing unit: a specialized integrated circuit developed by Google and optimized for neural network machine learning, which makes it well-suited for training large language models.

The TPU is a critical component of Google's AI Supercomputer, which enables the company to develop the large language models that are spurring the current AI revolution.

When you stack up these GPUs (a few hundred were enough just a few years ago), that's how you get an AI supercomputer.

How hundreds of TPUs make up Google's AI Supercomputer - Credit: MIT Technology Review

Microsoft's Azure AI Supercomputing Facility - Credit: Microsoft

Of course, there is way more to it, as there are various hardware architectures for building a powerful AI Supercomputer.

On top of that, to be competitive right now, an AI Supercomputer needs to employ thousands of GPUs or TPUs.
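To make "stacking GPUs" concrete, here is a minimal data-parallel sketch, assuming PyTorch and however many CUDA GPUs happen to be visible on the machine. Real AI supercomputers apply the same idea, replicating the model and splitting the work, across thousands of networked chips.

```python
# A minimal sketch of scaling across GPUs: data parallelism replicates the
# model on each device and splits every batch among the replicas. AI
# supercomputers do this across thousands of GPUs linked by high-speed
# networking. Assumes PyTorch; degrades gracefully to a single device.
import torch
import torch.nn as nn

model = nn.Linear(1024, 1024)

if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model).cuda()  # replicate across all visible GPUs
elif torch.cuda.is_available():
    model = model.cuda()

x = torch.randn(512, 1024)
if torch.cuda.is_available():
    x = x.cuda()

y = model(x)  # the batch of 512 is split evenly across the GPU replicas
print(y.shape, "computed on", max(torch.cuda.device_count(), 1), "device(s)")
```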

But the key point is that the GPU isn't just a new chip; it's a software platform...

In the latest earnings call, Jensen Huang explained why the GPU isn't just a chip but way more than that; it becomes a software platform:

NVIDIA GPUs is like a chip. But the NVIDIA Hopper GPU has 35,000 parts. It weighs 70 pounds. These things are really complicated things we've built. People call it an AI supercomputer for good reason. If you ever look in the back of the data center, the systems, the cabling system is mind boggling. It is the most dense complex cabling system for networking the world's ever seen.

The more Generative AI integrates into anything, the more the hardware part (for major tech players) becomes the critical moat.

In addition to that, the underlying cloud infrastructure, which serves the Generative AI paradigm, also shifts toward inferencing!

Indeed, as you can see from the above, NVIDIA's revenue from computing more than tripled in a single year!

Can you guess why? It's the new inference paradigm!

As Jensen Huang highlighted in the latest earnings release:

One, the amount of inference that we do is just off the charts now. Almost every single time you interact with ChatGPT, that we're inferencing. Every time you use Midjourney, we're inferencing. Every time you see amazing -- these Sora videos that are being generated or Runway, the videos that they're editing, Firefly, NVIDIA is doing inferencing. The inference part of our business has grown tremendously. We estimate about 40%. The amount of training is continuing, because these models are getting larger and larger, the amount of inference is increasing.
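To ground the distinction Huang is drawing: training updates a model's weights (once, at enormous cost), while inference runs the frozen model every single time a user prompts it. Here is a minimal inference sketch, assuming the Hugging Face transformers library and the small open GPT-2 model as a stand-in for the far larger models he mentions.

```python
# A minimal sketch of inference: running a frozen, pre-trained model to
# generate new tokens. No weights are updated; this is what happens, at
# vastly larger scale, on NVIDIA GPUs every time you prompt a chatbot.
# Assumes the Hugging Face `transformers` library and the small GPT-2 model.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

out = generator("The GPU isn't just a chip; it's", max_new_tokens=20)
print(out[0]["generated_text"])
```

Every such call consumes GPU cycles, which is why, as usage of services like ChatGPT and Midjourney explodes, inference becomes a large and growing share of NVIDIA's business.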

Paradigm Shift 2: AI Supercomputers turn the cloud into an AI Generation Factory

A vital element of the current AI landscape is the ability of large language models to be pre-trained in an unsupervised manner.

AI models can learn from large amounts of unlabelled/unstructured data by turning it into tokens.

Before, you needed hundreds of humans to manually and carefully curate that data just to make it usable for developing AI models in the first place.

Fundamentally, the whole process takes raw material (data) and, through accelerated computing, turns it into tokens, which are the language of Generative AI. These tokens are generated in a specialized data center: a supercomputing data center, or an AI generation factory.

Image Credit: Microsoft
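To make the idea concrete, here is a minimal sketch of raw text becoming tokens, assuming the open-source tiktoken library (the tokenizer family used by GPT-4-class models).

```python
# A minimal sketch of "data into tokens": a tokenizer maps raw text to
# integer IDs, the unit that models are trained on and that AI generation
# factories produce at scale. Assumes the open-source `tiktoken` library.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # the encoding used by GPT-4

text = "NVIDIA is no longer a graphics company; it's an AI inference machine."
tokens = enc.encode(text)

print(len(tokens), "tokens")
print(tokens[:8])          # the first few integer token IDs
print(enc.decode(tokens))  # decoding recovers the original text
```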

That's why everyone who wants to seriously compete in the new AI industry must look into producing their own chips. And to give you some context: to pre-train a large language model at the level of the most advanced ones on the market today (like GPT-4 or Gemini 1.5), you need a few billion dollars to start!

Microsoft is reportedly developing its own AI chips for training large language models, a project that has been kept secret since 2019.

The company aims to reduce reliance on Nvidia, the current key supplier of AI server chips, and cut costs associated with deploying AI software.

For some context, OpenAI is estimated to need over 30,000 of Nvidia's A100 GPUs for commercializing ChatGPT, and Nvidia's H100 GPUs are in high demand, selling for over $40,000 on eBay.

Run those numbers, and the hardware bill alone approaches a billion dollars just to start!
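Here is the back-of-envelope math behind that figure; the per-GPU prices are rough assumptions based on the numbers quoted above, not official list prices.

```python
# Back-of-envelope estimate of the GPU bill implied by the figures above.
# Prices are rough assumptions, not official list prices.
gpu_count = 30_000       # estimated GPUs needed to commercialize ChatGPT
a100_price = 25_000      # assumed A100 street price, USD
h100_price = 40_000      # reported H100 resale price on eBay, USD

print(f"30k A100s: ${gpu_count * a100_price / 1e9:.2f}B")  # ~$0.75B
print(f"30k H100s: ${gpu_count * h100_price / 1e9:.2f}B")  # ~$1.20B
```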

This is where we are in terms of resources needed to be competitive on the foundational layer...

And things are getting even more competitive: just to pre-train a model at the level of GPT-4, a company might need a few billion dollars' worth of GPUs!

That is why Microsoft's project, codenamed Athena, involves building in-house AI chips, which may be made available within Microsoft and OpenAI as early as next year.

Microsoft's AI chips are not direct replacements for Nvidia's, but they could significantly reduce costs for Microsoft's AI-powered features in Bing, Office apps, GitHub, and more.

Microsoft has also been exploring the design of ARM-based chips for servers and potential Surface devices.

Other tech giants, including Amazon, Google, and Meta, have developed their own in-house AI chips, but many companies still rely on Nvidia chips for large language models.

In AI is eating software, I explained in detail how the quote from NVIDIA's CEO, Jensen Huang, "Software is eating the world, but AI is going to eat software," is playing out.

I also explained why this trend was a continuation and the last leg of a movement that Marc Andreessen emphasized in 2011 about Software eating the world.

That's how you want to frame the current AI revolution!

In short, in the coming decade we'll see the maximum potential we can unlock by transforming everything into a software paradigm, where anything moves from dumb to smart, from static to dynamic, and from generalized to hyper-personalized!

And yet, there is a paradox to this revolution.

Paradigm Shift 3: The Emergence of A Whole New Industry (Generative AI Native)

As Jensen Huang emphasized, NVIDIA enabled a whole new computing paradigm, generative AI, where software can learn, understand, and generate any information, from human language to the structure of biology and the 3D world.

He also highlighted how with accelerated computing:

You can dramatically improve your energy efficiency. You can dramatically improve your cost in data processing by 20 to 1. Huge numbers. And of course, the speed. That speed is so incredible that we enabled a second industry-wide transition called generative AI.

How does this translate at a consumer level?

You see consumer Internet services that are now augmenting all of their services of the past with generative AI. So they can have even more hyper-personalized content to be created.

Thus:

Generative AI really becoming a whole new application space, a whole new way of doing computing, a whole new industry is being formed and that's driving our growth.

How will this happen?

Every company in every industry is fundamentally built on their proprietary business intelligence, and in the future, their proprietary generative AI.

In short, as I've also been highlighting over here, incumbents will initially be the ones benefiting the most from the adoption of Generative AI. Still, eventually, this will turn into a whole new thing... Let me take the snippet I've been using for the last two years to emphasize that, below:

Many discussions today around AI look into the technical aspect but are missing the broader picture.

Indeed, while the technical aspect of LLMs might play a role in developing the AI industry, there is a more critical aspect to consider.

One example playing out as we speak is the transition from search to Generative AI. Many technologists look at it and think the whole world will adopt the new generative AI paradigm in a few years.

My argument instead goes in the opposite direction. It tells you that for Generative AI to go mainstream, it needs first to become part of the existing demand and, from there, expand it many times over!

That tells you that incumbents are the ones who will make the most of this first wave initially. But over time, native generative AI companies will dominate the market!

That doesn't mean all incumbents will disappear. It simply means we'll see a reshuffling of the competitive landscape (over 10-20 years) where new players will dominate among the big ones...

And that is a process that might require a decade, at least!

Recap: In This Issue!

  • NVIDIA's Paradigm Shifts: CEO Jensen Huang outlined three significant paradigm shifts in the software industry during NVIDIA's latest earnings release:

    • Shift 1: General to Accelerated Computing: Expected to improve energy efficiency and cost by 20x, with step-change speed improvements.

    • Shift 2: Generative AI Revolution: Transforming software and computing, particularly in cloud infrastructure, from retrieval to inference.

    • Shift 3: Emergence of a New Industry: Generative AI, enabled by accelerated computing, creating a new computing paradigm for various applications.

  • GPU Evolution: GPUs, initially designed for 3D graphics rendering in gaming, have become pivotal in AI and machine learning. TPUs, like Google's, are specialized for neural network machine learning. Both are crucial for AI supercomputers.

  • AI Supercomputers and Inference: Inference, running a trained model to generate outputs, has seen exponential growth, especially with applications like ChatGPT and video generation. NVIDIA's revenue from computing has more than tripled, largely due to inference.

  • Cloud as AI Generation Factory: Large language models can now be pre-trained unsupervised, turning raw data into tokens. Companies like Microsoft are investing in developing their own AI chips to reduce reliance on Nvidia and cut costs.

  • Future of AI Revolution: The AI revolution will see a transformation from static to dynamic and generalized to hyper-personalized software paradigms. Generative AI will become a new application space, driving industry growth and reshuffling the competitive landscape.

  • Market Expansion Theory: Generative AI's mainstream adoption will initially benefit incumbents but will eventually lead to the dominance of native generative AI companies, reshaping the competitive landscape over the next decade or two.

Ciao!

With ♥️ Gennaro, FourWeekMBA