[Unconfirmed] Apple Enlists Google Cloud to Scale "Gemini-Powered" Siri Amid Infrastructure Concerns
Apple Taps Google Cloud to Power the Next-Gen Siri as Internal Servers Face Bottlenecks.
To ensure user privacy, Apple Intelligence has traditionally relied on a two-tier processing model: on-device processing for basic tasks, and Private Cloud Compute (PCC), Apple's custom silicon-powered servers, for more complex requests. Utilization of PCC currently remains relatively low, hovering at just 10% of total capacity.
However, a major shift is on the horizon. With the upcoming launch of an upgraded Siri later this year, enhanced by Google's Gemini model, Apple anticipates a massive surge in demand. The projected traffic is so significant that Apple's internal infrastructure may not be able to scale in time to meet it.
The Infrastructure Gap and the Google Partnership
According to a report from The Information, Apple has asked Google to deploy specialized servers within Google Cloud to support the Gemini-integrated Siri. The move is driven by two primary factors:
Capacity Constraints: Apple’s own cloud expansion may not keep pace with the rapid influx of users.
Hardware Obsolescence: Sources suggest that Apple’s current cloud hardware specs may be outdated, lacking the raw power required to handle the massive compute demands of large-scale Gemini models.
While Apple previously stated that all high-level processing would occur within its own Private Cloud Compute, time constraints have forced a collaboration. Notably, Google has its own privacy-centric infrastructure, known as "Private AI Compute," which aligns with Apple's stringent data protection requirements.
Apple's biggest challenge is maintaining its privacy reputation while running on someone else's cloud, in this case Google Cloud. To address this, Apple reportedly relies on "stateless processing": each request is handled in memory and its data is deleted immediately afterward, preventing Google, as the cloud provider, from accessing or retaining it. This approach falls under the broader umbrella of confidential computing.
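The stateless idea described above can be sketched in a few lines. This is purely an illustrative toy, not Apple's or Google's actual implementation; the function name and the in-memory wipe are assumptions chosen to show the principle that request data lives only for the duration of processing and is destroyed before the handler returns.

```python
# Illustrative sketch of "stateless processing": the request is handled
# entirely in memory and its contents are wiped as soon as the response
# is produced, so nothing persists on the host for the provider to read.
# All names here are hypothetical, not real Apple or Google APIs.

def handle_request(payload: bytearray) -> str:
    try:
        # Process the request while it exists only in RAM;
        # nothing is written to disk or to logs.
        return f"processed {len(payload)} bytes"
    finally:
        # Zero out the buffer immediately after use, even if
        # processing raised an exception.
        for i in range(len(payload)):
            payload[i] = 0

buf = bytearray(b"user query")
print(handle_request(buf))  # the request is served
print(bytes(buf))           # the original data has been zeroed out
```

In a real confidential-computing deployment this guarantee comes from hardware-isolated enclaves and attestation rather than application code, but the contract is the same: the operator of the machine never gains durable access to the data.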
Although Apple is also in discussions with OpenAI, the choice of Google Cloud for this large-scale project may stem from the readiness of Google's TPUs (Tensor Processing Units), which are regarded as offering the best performance per watt for running large-scale Gemini models; Apple's M-series server chips may lag behind in this respect.
This reflects Apple's acceptance of the reality that "Hardware is hard, but AI infrastructure is harder." Expanding data centers to meet global AI demands requires enormous capital and time. Partnering with a frenemy like Google is the smartest short-term solution to prevent the new Siri from experiencing crashes or delays on launch day.
If this model succeeds, we may see a new standard emerge among big tech companies: shared "secure infrastructure," verified by third-party audits, that ensures user data does not leak even when it crosses cloud boundaries.
Source: The Information