Google Cloud and NVIDIA Expand Partnership to Power Next-Gen AI Workloads

Unveiling co-engineered AI infrastructure solutions at NVIDIA GTC 2026

Mar. 16, 2026 at 11:00pm

At NVIDIA GTC 2026, Google Cloud and NVIDIA announced an expanded partnership on AI infrastructure. The announcements include an expansion of Google Cloud's G4 VM offerings powered by NVIDIA RTX Pro 6000 Blackwell Server Edition GPUs, the introduction of fractional G4 VMs built on NVIDIA vGPU technology, and upcoming support for the NVIDIA Vera Rubin NVL72 platform. The companies also showcased integrations between NVIDIA Dynamo and GKE Inference Gateway, as well as enhancements to Vertex AI training and Model Garden aimed at next-generation agentic AI workloads.

Why it matters

As enterprises adopt agentic AI systems capable of dynamic reasoning and autonomous execution, the underlying infrastructure must evolve to keep up with these workloads. The collaboration between Google Cloud and NVIDIA aims to provide a co-engineered, optimized stack that delivers the scale, resilience, and efficiency they require.

The details

The key announcements include the expansion of Google Cloud's G4 VM offerings powered by NVIDIA RTX Pro 6000 Blackwell Server Edition GPUs, which target a range of high-performance workloads, from spatial computing to AI development. Google Cloud is also introducing fractional G4 VMs, which use NVIDIA virtual GPU (vGPU) technology to provide more granular access to those same GPUs. Additionally, Google Cloud plans to be among the first cloud providers to offer NVIDIA's Vera Rubin NVL72 rack-scale systems in the second half of 2026, integrating them into its AI Hypercomputer architecture.

  • The fractional G4 VMs are currently in preview.
  • Google Cloud plans to offer NVIDIA Vera Rubin NVL72 rack-scale systems in the second half of 2026.

The players

Google Cloud

A leading cloud computing platform that provides a range of infrastructure, platform, and software services to enterprises and developers.

NVIDIA

A multinational technology company known for its graphics processing units (GPUs) and AI computing solutions.

Otto Group One.O

A customer of Google Cloud's G4 VMs, using them to run physically accurate simulations and real-time 3D rendering at scale.

WPP

A customer of Google Cloud's G4 VMs, using them to run physically accurate simulations and real-time 3D rendering at scale.

ElevenLabs

A customer of Google Cloud's G4 VMs, using them to power their multimodal AI models for voice agents at enterprise scale.

What they’re saying

“Google Cloud's G4 VMs give us the scalable GPU backbone we need to push billions of miles of photorealistic simulation through our pipeline. The 4x lift in throughput means our ML teams can iterate faster, train on richer data, and validate edge cases long before our models ever see the real world.”

— Sony Mohapatra, Director, AI/ML Engineering (General Motors)

“Now with G4 VMs powered by NVIDIA Blackwell, we're pushing our multimodal models even further — faster inference, better reliability, instant replies across languages. The goal stays the same: making voice agents that work at enterprise scale without compromise. We are excited to keep building together and see what our customers deploy with this.”

— Mati Staniszewski, Cofounder (ElevenLabs)

What’s next

Google Cloud plans to offer NVIDIA Vera Rubin NVL72 rack-scale systems in the second half of 2026, integrating them into its AI Hypercomputer architecture.

The takeaway

The expanded partnership signals that Google Cloud and NVIDIA intend to keep co-engineering an optimized infrastructure stack for agentic AI workloads, with the stated goal of helping enterprises scale their most ambitious AI initiatives.