
The next generation of AI devices is here. Who will be its Nvidia?


“The very next thing we are going to see, and the gate for the horses to run this race opened a few days ago,” he said. “It’s the AI PC and the AI smartphone.”

Pollak was referring to the Copilot+ PCs announced by Microsoft last week. The machines, which the company is describing as a “new era” in computers, are AI-centric laptops that, for the moment at least, run exclusively on Qualcomm chips.

Later this year they’ll also use chipsets from Intel and AMD, but the first generation of Copilot+ PCs will all be Qualcomm-based when they go on sale in June.

Microsoft Surface Pro devices will feature a new AI assistant known as Copilot, which the company hopes will boost sales. AP

Meanwhile, AI smartphones, which began to appear late last year when Google launched its Pixel 8 and accelerated early this year when Samsung launched its Galaxy S24, should reach a crescendo next month, when Apple is expected to announce AI changes to its iPhones. These devices either use Qualcomm chips or, in the case of those three companies, run Arm-based chips from the same family as Qualcomm’s.

But to understand what all this has to do with Nvidia, one first needs to understand the difference between the two big consumers of processor power in the AI era: training and inference.

Training an AI model is enormously compute-intensive. For the generative AI models being built by companies like Google, OpenAI, Microsoft and Meta, training involves hoovering up all the world’s data and looking for statistical relationships between things, and it has seen customers lining up to fill their data centres with powerful Nvidia systems.

But inference, which involves taking a model and getting it to do something useful like writing an email for you, doesn’t require nearly as much compute. Inference, too, has typically been run centrally in huge data centres powered by Nvidia (or similar) chips, but that’s starting to change.
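For readers who want the distinction made concrete, the toy Python sketch below (illustrative only, not any AI lab’s actual code) contrasts the two: training loops over an entire dataset many times while computing gradients, whereas inference is a single forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 10,000 examples, 64 features each.
X = rng.normal(size=(10_000, 64))
true_w = rng.normal(size=64)
y = X @ true_w + rng.normal(scale=0.1, size=10_000)

# --- Training: repeated sweeps over all the data, with gradients ---
w = np.zeros(64)
lr = 0.01
for epoch in range(100):                 # many passes over the dataset
    preds = X @ w                        # forward pass over every example
    grad = X.T @ (preds - y) / len(y)    # backward pass (gradient)
    w -= lr * grad
# Rough cost: epochs x examples x features x (forward + backward) work.

# --- Inference: one forward pass on one new input ---
x_new = rng.normal(size=64)
prediction = x_new @ w                   # ~64 multiply-adds, done once
print(prediction)
```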

What the AI phones from Google, Samsung and (soon) Apple have in common – and what Microsoft’s Copilot+ PC also has – is that they all do AI inferencing locally, on low-powered chips inside the device rather than on high-powered chips in the cloud.

Training, by and large, is staying in the cloud, but inferencing is spreading out to the edge devices.

In order to qualify for the Copilot+ PC branding, for instance, laptops need to have an inferencing chip known as a Neural Processing Unit, or NPU, capable of running at least 40 trillion operations per second (40 TOPS).

The Qualcomm Snapdragon X Elite chipset on the first generation of Copilot+ PCs will actually be capable of more than that: 45 TOPS.

That’s not a lot of compute power compared to the 800 TOPS Nvidia’s laptop GPUs are capable of, but Microsoft is betting it’s enough for AI inferencing, even if it’s not enough for AI training.
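A rough back-of-envelope calculation shows why. Assuming a small, roughly 3-billion-parameter on-device model and the common rule of thumb that generating one token costs about two operations per parameter (both assumptions, not vendor figures):

```python
# Back-of-envelope: what a 45 TOPS NPU budget means for on-device inference.
npu_tops = 45e12            # 45 trillion operations per second (Snapdragon X Elite NPU)
model_params = 3e9          # assumed on-device model size: 3 billion parameters
ops_per_token = 2 * model_params   # rule of thumb: ~2 ops per parameter per token

theoretical_tokens_per_sec = npu_tops / ops_per_token
print(f"Theoretical peak: {theoretical_tokens_per_sec:,.0f} tokens/sec")
# In practice, memory bandwidth, precision and utilisation cut this sharply,
# but it illustrates why 45 TOPS can be ample for inferencing a small model,
# while training, which repeats this work across vast datasets, stays in the cloud.
```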

Indeed, to help inferencing run more effectively on consumer devices like AI PCs and AI phones, Microsoft, Google and others are training new, lightweight versions of their models that run fast on low-powered NPUs, but still have enough accuracy to satisfy consumers.

Microsoft’s Copilot+ PCs will ship with some 40 models of differing sizes, and Google similarly offers multiple sizes of its Gemini model, some of which are small enough to have their inferencing done “on device”, and some so big they still need to run in data centres in the cloud.
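To illustrate the pattern, here is a sketch of how an app might decide which member of a model family runs locally. The model names, sizes and memory threshold below are invented for illustration, not Microsoft’s or Google’s actual figures.

```python
from dataclasses import dataclass

@dataclass
class ModelTier:
    name: str
    params_billions: float

# Hypothetical model family, smallest to largest (names are invented).
FAMILY = [
    ModelTier("nano", 1.0),     # small enough for phone/laptop NPUs
    ModelTier("small", 3.0),
    ModelTier("medium", 30.0),  # too big for today's devices
    ModelTier("large", 300.0),  # data-centre only
]

def place_model(tier: ModelTier, device_memory_gb: float) -> str:
    """Decide where a model's inferencing runs, assuming roughly one byte
    per parameter after int8 quantisation plus 25% working overhead."""
    footprint_gb = tier.params_billions * 1.0 * 1.25
    return "on device" if footprint_gb <= device_memory_gb else "in the cloud"

for tier in FAMILY:
    print(f"{tier.name}: runs {place_model(tier, device_memory_gb=16)}")
```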

From an AI stock investment perspective, Loftus’ Pollak says it’s still very much an open question how much value this shift to NPU-based inferencing will draw away from Nvidia and hand to companies like Qualcomm.

But what it does do is open up the possibility of a whole new generation of apps, which take advantage of this local AI inferencing to produce results that were either impossible or impractical to achieve using the cloud.

Even if local inferencing of small models has the disadvantage of not being as accurate as cloud-based inferencing of large models, it has the distinct advantage of being fast, cheap and, above all, private.
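In practice, that trade-off becomes a routing decision each app makes per request. Here is a hypothetical sketch of such a policy; the function and its rules are invented for illustration, not any vendor’s documented behaviour.

```python
def route_request(prompt: str, contains_personal_data: bool,
                  needs_high_accuracy: bool) -> str:
    """Illustrative routing logic for an AI app on a Copilot+-class device."""
    if contains_personal_data:
        return "local NPU"    # private: data never leaves the device
    if not needs_high_accuracy:
        return "local NPU"    # fast and free: no network round-trip
    return "cloud GPU"        # accuracy worth the latency and cost

print(route_request("summarise my email", contains_personal_data=True,
                    needs_high_accuracy=False))
```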

Quizzed on which of those apps might be worth investing in, Pollak was reluctant to say. It’s early days, and we’re yet to see how app developers are going to take advantage of the new AI PCs and AI phones.

As it was at the dawn of the internet and the smartphone, most likely it will be apps no one has even thought of yet.
