Apple is reportedly trying to distill Google’s massive Gemini model to run on the iPhone and power the next Siri, but cloud offloading means some of your data leaves the device.
According to The Information, Apple’s Gemini-powered Siri will run both on-device and in the cloud, relying on Google and Nvidia infrastructure. Apple has long touted on-device AI as a privacy advantage, but phone hardware simply cannot run multi-trillion-parameter models.
The approach: distillation – making smaller models mimic larger ones. This works for basic tasks, but complex queries still go to the cloud via Nvidia’s Confidential Computing platform.
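For readers unfamiliar with the term, here is a minimal sketch of the textbook soft-label form of distillation: the small "student" model is trained to match the large "teacher" model's output probabilities. This illustrates the general technique only. Nothing in the report describes how Apple or Google actually implement it, and the function names, logits, and temperature value below are assumptions made up for the example.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper for illustration; the T^2 factor is the usual
    scaling used in soft-label distillation.
    """
    p = softmax(teacher_logits, temperature)  # what the large model predicts
    q = softmax(student_logits, temperature)  # what the small model predicts
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Made-up logits over three possible answers.
teacher = [4.0, 1.0, 0.0]
student_far = [0.0, 1.0, 4.0]    # disagrees with the teacher
student_near = [3.5, 1.2, 0.1]   # mostly agrees with the teacher

print(distillation_loss(teacher, student_far))
print(distillation_loss(teacher, student_near))
```

Training pushes this loss toward zero, so the student that already agrees with the teacher scores lower. A small model can only get so close, which is consistent with the report that harder queries still go to the cloud.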
Bottom line: at least some of your Siri conversations will leave your phone. Once data leaves your device, it is no longer “on-device” AI.
Source: https://arstechnica.com/ai/2026/05/apple-reportedly-trying-to-distill-googles-multi-trillion-parameter-gemini-ai-to-run-on-iphone/