Monday, December 23, 2024

WWDC 2024: Apple Intelligence

An oft-told story is that back in 2009 — two years after Dropbox debuted, two years before Apple unveiled iCloud — Steve Jobs invited Dropbox cofounders Drew Houston and Arash Ferdowsi to Cupertino to pitch them on selling the company to Apple. Dropbox, Jobs told them, was “a feature, not a product”.

It’s easy today to forget just how revolutionary a product Dropbox was. A simple installation on your Mac and boom, you had a folder that synced between every Mac you used — automatically, reliably, and quickly. At the time Dropbox had a big sign in its headquarters that read, simply, “It Just Works”, and they delivered on that ideal — at a time when no other sync service did. Jobs, of course, was trying to convince Houston and Ferdowsi to sell, but that doesn’t mean he was wrong that, ultimately, it was a feature, not a product. A tremendously useful feature, but a feature nonetheless.

Leading up to WWDC last week, I’d been thinking that this same description applies, in spades, to LLM generative AI. Fantastically useful, downright amazing at times, but features. Not products. Or at least not broadly universal products. Chatbots are products, of course. People pay for access to the best of them, or for extended use of them. But people pay for Dropbox too.

Chatbots can be useful. There are people doing amazing work through them. But they’re akin to the terminal and command-line tools. Most people just don’t think like that.

What Apple unveiled last week with Apple Intelligence wasn’t so much new products, but new features — a slew of them — for existing products, powered by generative AI.

Safari? Better now, with generative AI page summaries. Messages? More fun, with Genmoji. Notes and Mail and Pages (and any other app that uses the system text frameworks)? Better now, with proofreading and rewriting tools built-in. Photos? Even better recommendations for memories, and automatic categorization of photos into smart collections. Siri? That frustrating, dumb-as-a-rock son of a bitch, Siri? Maybe, actually, pretty useful and kind of smart now. These aren’t new apps or new products. They’re the most used, most important apps Apple makes, the core apps that define the Apple platforms ecosystem, and Apple is using generative AI to make them better and more useful — without, in any way, rendering them unfamiliar.

We had a lot of questions about Apple’s generative AI strategy heading into WWDC. Now that we have the answers, it all looks very obvious, and mostly straightforward. First, their models are almost entirely based on personal context, by way of an on-device semantic index. In broad strokes, this on-device semantic index can be thought of as a next-generation Spotlight. Apple is focusing on what it can do that no one else can on Apple devices, and not really even trying to compete against ChatGPT et al. for world-knowledge context. They’re focusing on unique differentiation, and eschewing commoditization.

Second, they’re doing both on-device processing, for smaller/simpler tasks, and cloud processing (under the name Private Cloud Compute) for more complex tasks. All of this is entirely Apple’s own work: the models, the servers (based on Apple silicon), the entire software stack running on the servers, and the data centers where the servers reside. This is an enormous amount of work, and seemingly puts the lie to reports that Apple executives only even became interested in generative AI 18 months ago. And if they did accomplish all this in just 18 months that’s a remarkable achievement.

Anyone can make a chatbot. (And, seemingly, everyone is — searching for “chatbot” in the App Store is about as useful as searching for “game”.) Apple, conspicuously, has not made one. Benedict Evans keenly observes:

To begin, then: Apple has built an LLM with no chatbot. Apple has
built its own foundation models, which (on the benchmarks
it published) are comparable to anything else on the market, but
there’s nowhere that you can plug a raw prompt directly into the
model and get a raw output back – there are always sets of buttons
and options shaping what you ask, and that’s presented to the user
in different ways for different features. In most of these
features, there’s no visible bot at all. You don’t ask a question
and get a response: instead, your emails are prioritised, or you
press ‘summarise’ and a summary appears. You can type a request
into Siri (and Siri itself is only one of the many features using
Apple’s models), but even then you don’t get raw model output
back: you get GUI. The LLM is abstracted away as an API call.

Instead Apple is doing what no one else can do: integrating generative AI into the frameworks in iOS and MacOS used by developers to create native apps. Apps built on the system APIs and frameworks will gain generative AI features for free, both in the sense that the features come automatically when the app is running on a device that meets the minimum specs to qualify for Apple Intelligence, and in the sense that Apple isn’t charging developers or users to utilize these features.

Apple’s keynote presentation was exceedingly well-structured and paced. But nevertheless it was widely misunderstood, I suspect because expectations were so wrong. Those who believed going in that Apple was far behind the state of the art in generative AI technology wrongly saw the keynote’s coda — the announcement of a partnership with OpenAI to integrate their latest model, GPT-4o, as an optional “world knowledge” layer sitting atop Apple’s own homegrown Apple Intelligence — as an indication that most or even all of the cool features Apple revealed were in fact powered by OpenAI. Quite the opposite. Almost nothing Apple showed in the keynote was from ChatGPT.

What I see as the main takeaways:

  • Apple continues to build machine learning and generative AI features across its core platforms, iOS and MacOS. They’ve been adding such features for years, and announced many new ones this year. Nothing Apple announced in the entire first hour of the keynote was part of “Apple Intelligence”. Math Notes (freeform handwritten or typed mathematics, in Apple Notes and the Calculator app, which is finally coming to iPadOS) is coming to all devices running iOS 18 and MacOS 15 Sequoia. Smart Script — the new personalized handwriting feature when using Apple Pencil, which aims to improve the legibility of your handwriting as you write, and simulates your handwriting when pasting text or generating answers in Math Notes — is coming to all iPads with an A14 or better chip. Inbox categorization and smart message summaries are coming to Apple Mail on all devices. Safari web page summaries are coming to all devices. Better background clipping (“greenscreening”) for videoconferencing. None of these features are under the “Apple Intelligence” umbrella. They’re for everyone with devices eligible for this year’s OS releases.

  • The minimum device specs for Apple Intelligence are understandable, but regrettable, particularly the fact that the only current iPhones that are eligible are the iPhone 15 Pro and Pro Max. Even the only-nine-month-old iPhone 15 models don’t make the cut. When I asked John Giannandrea (along with Craig Federighi and Greg Joswiak) about this on stage at The Talk Show Live last week, his answer was simple: lesser devices aren’t fast enough to provide a good experience. That’s the Apple way: better not to offer the feature at all than offer it with a bad (slow) experience. A-series chips before last year’s A17 Pro don’t have enough RAM and don’t have powerful enough Neural Engines. But by the time Apple Intelligence features actually become available — even in beta form (they are not enabled in the current developer OS betas) — the iPhone 15 Pro will surely be joined by all iPhone 16 models, both Pro and non-pro. Apple Intelligence is skating to where the puck is going to be in a few years, not where it is now.

  • Surely Apple is also being persnickety with the device requirements to lessen the load on its cloud compute servers. And if this pushes more people to upgrade to a new iPhone this year, I doubt Tim Cook is going to see that as a problem.

  • One question I’ve been asked repeatedly is why devices that don’t qualify for Apple Intelligence can’t just do everything via Private Cloud Compute. Everyone understands that if a device isn’t fast or powerful enough for on-device processing, that’s that. But why can’t older iPhones (or in the case of the non-pro iPhones 15, new iPhones with two-year-old chips) simply use Private Cloud Compute for everything? From what I gather, that just isn’t how Apple Intelligence is designed to work. The models that run on-device are entirely different models than the ones that run in the cloud, and one of those on-device models is the heuristic that determines which tasks can execute with on-device processing and which require Private Cloud Compute or ChatGPT. But, see also the previous item in this list — surely Apple has scaling concerns as well. As things stand, with only devices using M-series chips or the A17 or later eligible, Apple is going to be on the hook for an enormous amount of server-side computation with Private Cloud Compute. They’d be on the hook for multiples of that scale if they enabled Apple Intelligence for older iPhones, with those older iPhones doing none of the processing on-device. The on-device processing component of Apple Intelligence isn’t just nice-to-have, it’s a keystone to the entire thing.

  • Apple could have skipped, or simply delayed announcing until the fall, the entire OpenAI partnership, and they still would have had an impressive array of generative AI features with broad, practical appeal. And clearly they would have gotten a lot more credit for their achievements in the aftermath of the keynote. I remain skeptical that integrating ChatGPT (and any future world-knowledge LLM chatbot partners) at the OS level will bring any significant practical advantage to users versus just using the chatbot apps from the makers of those LLMs. But perhaps removing a few steps, and eliminating the need to choose, download, and sign up for a third-party chatbot, will expose such features to many more users than are using them currently. Still, I can’t help but feel that integrating these third-party chatbots in the OSes is at least as much a services-revenue play as a user-experience play.

  • The most unheralded aspect of Apple Intelligence is that the data centers Apple is building for Private Cloud Compute are not only carbon neutral, but are operating entirely on renewable energy sources. That’s extraordinary, and I believe unique in the entire industry. But it’s gone largely un-remarked-upon — because Apple itself did not mention this during the WWDC keynote. Craig Federighi first mentioned it in a post-keynote interview with Justine Ezarik, and he reiterated it on stage with me at The Talk Show Live From WWDC. In hindsight, I wish I’d asked, on stage, why Apple did not even mention this during the keynote, let alone trumpet it. I suspect the real answer is that Apple felt like they couldn’t brag about their own data centers running entirely on renewable energy during the same event in which they announced a partnership with OpenAI, whose data centers can make no such claims. OpenAI’s carbon footprint is a secret, and experts suspect it’s bad. It’s unseemly to throw your own partner under the bus, but that takes Apple Intelligence’s proclaimed carbon neutrality off the table as a marketing point. Yet another reason why I feel Apple might have been better off not announcing this partnership last week.

  • If you don’t want or don’t trust Apple Intelligence (or just not yet), you’ll be able to turn it off. And you’ll have to opt-in to using the integrated ChatGPT feature, and, each time Apple Intelligence decides to send you to ChatGPT to handle a task, you’ll have to explicitly allow it. As currently designed, no one is going to accidentally interact with, let alone expose personal information to, ChatGPT. If anything I suspect the more common complaint will come from people who wish to use ChatGPT without confirmation each time. At present there’s no “Always allow” option, but some people are going to want one.

  • At a technical level Apple is using indirection to anonymize devices from ChatGPT. OpenAI will never see your IP address or precise location. At a policy level, OpenAI has agreed not to store user data, nor use data for training purposes, unless users have signed into a ChatGPT account. If you want to use Apple Intelligence but not ChatGPT, you can. If you want to use ChatGPT anonymously, you can. And if you do want ChatGPT to keep a history of your interactions, you can do that too, by signing in to your account. Users are entirely in control, as they should be.

  • VisionOS 2 is not getting any Apple Intelligence features, despite the fact that the Vision Pro has an M2 chip. One reason is that VisionOS remains a dripping-wet new platform — Apple is still busy building the fundamentals, like rearranging and organizing apps in the Home view. VisionOS 2 isn’t even getting features like Math Notes, which, as I mentioned above, isn’t even under the Apple Intelligence umbrella. But another reason is that, according to well-informed little birdies, Vision Pro is already making significant use of the M2’s Neural Engine to supplement the R1 chip for real-time processing purposes — occlusion and object detection, things like that. With M-series-equipped Macs and iPads, the Neural Engine is basically sitting there, fully available for Apple Intelligence features. With the Vision Pro, it’s already being used.

  • “Apple Intelligence” is not one thing or one model. Or even two models — local and cloud. It’s an umbrella for dozens of models, some of them very specific. One of the best, potentially, is a new model that will allow Siri to answer technical support questions about Apple products and services. This model has been trained on Apple’s own copious Knowledge Base of support documentation. You can’t say “no one reads the documentation” any more — Siri is reading it. Apple’s platforms are so rich and deep, but most users’ knowledge of them is shallow; getting correct answers from Siri to specific how-to questions could be a game-changer. AI-generated slop is polluting web search results for technical help; Apple is using targeted AI trained on its own documentation to avoid the need to search the web in the first place. Technical documentation isn’t sexy, but exposing it all through natural language queries could be one of the sleeper hits of this year’s announcements.

  • Xcode is the one product where Apple was clearly behind on generative AI features. It was behind on LLM-backed code completion/suggestion/help last year. Apple introduced two generative AI features in Xcode 16, and they exemplify the local/cloud distinction in Apple Intelligence in general. Predictive code completion runs locally, on your Mac. Swift Assist is more profound, answering natural language questions and providing entire solutions in working Swift code, and runs entirely in Private Cloud Compute.
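For readers who think in code, the three-tier flow described in the items above (an on-device heuristic picks local processing or Private Cloud Compute, and ChatGPT is reached only after opt-in plus a per-request confirmation) can be sketched roughly as follows. This is an illustration only, not Apple’s implementation: the `Request` fields, the numeric `complexity` score, and the `on_device_limit` threshold are all hypothetical stand-ins for a heuristic that, per the above, is itself an on-device model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Tier(Enum):
    ON_DEVICE = "on-device"
    PRIVATE_CLOUD_COMPUTE = "private cloud compute"
    CHATGPT = "chatgpt"
    DECLINED = "declined"


@dataclass
class Request:
    prompt: str
    needs_world_knowledge: bool  # hypothetical flag, not a real Apple API
    complexity: int              # hypothetical 0-10 score standing in for the heuristic model


def route(
    request: Request,
    chatgpt_enabled: bool,
    confirm: Callable[[Request], bool],
    on_device_limit: int = 3,    # made-up threshold, for illustration only
) -> Tier:
    """Pick where a request runs, following the flow described in the article."""
    if request.needs_world_knowledge:
        # ChatGPT is opt-in, and every hand-off needs explicit approval;
        # there is no "Always allow" shortcut in this sketch either.
        if chatgpt_enabled and confirm(request):
            return Tier.CHATGPT
        return Tier.DECLINED
    if request.complexity <= on_device_limit:
        return Tier.ON_DEVICE
    return Tier.PRIVATE_CLOUD_COMPUTE


if __name__ == "__main__":
    always_yes = lambda r: True
    print(route(Request("proofread this note", False, 2), False, always_yes))
    print(route(Request("summarize a long thread", False, 8), False, always_yes))
    print(route(Request("plan a five-course meal", True, 5), True, always_yes))
```

Note that the confirmation callback is never invoked unless the user has already opted in to ChatGPT, which mirrors the point that no one should end up interacting with ChatGPT by accident.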

Take It All With a Grain of Salt

Lastly, it is essential to note that we haven’t been able to try any of these Apple Intelligence features yet. None of them are yet available in the developer OS betas, and none are slated to be available, even in beta, until “later this summer”. I witnessed multiple live demos of some of these features last week, during press briefings at Apple Park after the keynote. Demos I witnessed included the writing tools (“make this email sound more professional”) and Xcode code completion and Swift Assist. But those demos were conducted by Apple employees; we in the media were not able to try them ourselves.

It all looks very impressive, and almost all these features seem very practical. But it’s all very, very early. None of it counts as real until we’re able to use it ourselves. We don’t know how well it works. We don’t know how well it scales.

If generative AI weren’t seen as essential — both in terms of consumer marketing and investor confidence — I think much, if not most, of what Apple unveiled in “Apple Intelligence” wouldn’t even have been announced until next year’s WWDC, not last week’s WWDC. Again, none of the features in “Apple Intelligence” are even available in beta yet, and I think all or most of them will be available only under a “beta” label until next year.

It’s good to see Apple hustling, though. I continue to believe it’s incorrect to see Apple as “behind”, overall, on generative AI. But clearly they are feeling tremendous competitive pressure on this front, which is good for them, and great for us.
