NVIDIA Hand-Delivers First Custom Vera CPUs to Accelerate Agentic AI.

- May 20, 2026

NVIDIA Commences Production Shipments of "Vera" CPU: Exec Hand-Delivers First Agentic Silicon to OpenAI, Anthropic, xAI, and Oracle

Marking a monumental shift in the semiconductor landscape, NVIDIA has officially transitioned its highly anticipated next-generation architecture from lab announcement to live production deployment. In a high-profile gesture reminiscent of the company’s earliest AI deliveries, Ian Buck, NVIDIA’s Vice President of Hyperscale and High-Performance Computing (HPC), personally hand-delivered the inaugural batch of NVIDIA Vera CPU systems directly to the Silicon Valley headquarters of the world's most prominent AI laboratories.

The Architecture Behind the "Agentic CPU Moment"

The standalone Vera CPU represents NVIDIA's first fully custom data center processor, engineered specifically to power the host infrastructure of the upcoming NVIDIA Vera Rubin computing platform. As frontier AI models move away from static text responses toward complex "Agentic AI," in which autonomous background agents must concurrently execute code, parse vast databases, use web browsers, and run real-time simulations, the operational strain has shifted heavily back toward high-throughput CPU processing.

To address this exact computational bottleneck, the Vera CPU introduces:

- 88 Custom "Olympus" Cores: NVIDIA's first fully bespoke server cores, built on the Armv9.2 architecture, with up to a 50% single-threaded performance gain over legacy CPU designs.
- 1.2 TB/s Memory Bandwidth: An LPDDR5X memory subsystem that delivers nearly double the bandwidth of traditional CPUs at half the power consumption.
- NVIDIA Spatial Multithreading (SMT): A threading design that physically partitions each core's resources so the chip can run 176 concurrent threads (two per core) under full load.
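The thread count above follows from simple arithmetic: 176 threads across 88 cores is two threads per core. As a rough sketch of what that means for software, the snippet below sizes a worker pool to the hardware thread count and fans a batch of agent tasks across it. The `agent_step` function and the pool layout are hypothetical illustrations, not NVIDIA code, and only the two constants come from the article.

```python
from concurrent.futures import ThreadPoolExecutor

CORES = 88               # core count quoted in the article
HARDWARE_THREADS = 176   # thread count quoted in the article
THREADS_PER_CORE = HARDWARE_THREADS // CORES  # derived value: 2


def agent_step(agent_id: int) -> int:
    """Hypothetical stand-in for one agent's CPU-side work."""
    return sum(i * i for i in range(1_000)) + agent_id


def run_agents(n_agents: int, workers: int = HARDWARE_THREADS) -> list[int]:
    """Fan n_agents tasks across a pool sized to the hardware thread count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs
        return list(pool.map(agent_step, range(n_agents)))


if __name__ == "__main__":
    results = run_agents(500)
    print(f"{len(results)} agent steps finished on up to {HARDWARE_THREADS} workers")
```

In standard CPython, threads do not run CPU-bound work in parallel because of the global interpreter lock, so a real deployment would more likely use one process or container per agent. The pool-sizing idea is the same either way.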

Hand-Delivered to AI Royalty

The rollout began with Ian Buck making a personal delivery tour across the San Francisco Peninsula. The first stop was Anthropic's SoMa offices, where he presented the server hardware to James Bradbury, Head of Compute. Next came a short trip to OpenAI's Mission Bay headquarters, where Buck met with Sachin Katti, Head of Compute Infrastructure, and even pulled out a screwdriver on an open-air balcony to walk the team through the motherboard architecture.

The tour culminated at SpaceX-xAI's offices in Palo Alto, where tech billionaire Elon Musk was on-site to personally accept the delivery. xAI is reportedly evaluating the Vera CPU's extreme concurrency capabilities specifically to supercharge the reinforcement learning (RL) workloads and heavy agent-based simulation loops driving their training stacks. Meanwhile, Oracle Cloud Infrastructure (OCI) accepted its allocation shortly after, preparing to scale production-grade agentic environments across its global enterprise clouds.

The most telling remark came from Ian Buck, the creator of CUDA, who indicated that AI is transitioning from "token generation" to "agent calling." The problem is that when an AI performs calculations or writes applications, it needs to spin up a small simulated lab on the backend (a CPU sandbox environment) to run and test real Python code before returning the answer to the user. This type of task cannot be run on a GPU; it must be run on a CPU. The Vera chip was created to be a "thinking and acting accelerator," relieving the bottleneck of older servers.
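The sandbox step described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the "sandbox" is simply a separate interpreter process with a timeout and a throwaway working directory. The `run_in_sandbox` helper is a hypothetical name, not part of any NVIDIA or AI-lab toolchain, and a production system would add real isolation such as containers or virtual machines.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_sandbox(code: str, timeout_s: float = 5.0) -> dict:
    """Run generated Python source in a separate interpreter process.

    Hypothetical helper: process separation plus a timeout only,
    which is not a security boundary.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "snippet.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising
            return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {
        "ok": proc.returncode == 0,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    print(run_in_sandbox("print(2 + 3)"))
```

Each call costs a process launch, file I/O, and interpreter startup, all of which land on the host CPU rather than the GPU. That is the kind of load the article says agentic workloads multiply.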

Previously, NVIDIA relied on off-the-shelf cores, but for Vera the company designed its own core, called Olympus, and paired it with a second-generation Scalable Coherency Fabric (SCF). The fabric connects all 88 cores directly on a single compute die, avoiding the cross-chiplet latency found in other CPUs on the market. The goal is a system that stays stable and consistently fast even when hundreds of bots or agents are running simultaneously.

A vice president carrying a screwdriver and delivering a bare-board computer to a world-class CEO like Elon Musk wasn't just about service. It was a "high-level marketing strategy" by Jensen Huang to announce to the world that the big tech companies leading in large language models (LLMs) are lining up for, and primarily running on, NVIDIA's infrastructure, further fueling parabolic demand in the stock market and cloud sectors.

Source: NVIDIA
