Nvidia Debuts Groq 3 Language Processing Unit for Multiagent Workloads
The new chip is designed to accelerate AI inference processing beyond what GPUs can handle.
Mar. 17, 2026 at 12:55am
Nvidia announced the Groq 3 language processing unit (LPU), a dedicated inference chip for running AI models in multiagent systems. The Groq 3 LPU is designed to work alongside Nvidia's new Rubin GPUs and Vera CPUs to deliver high-throughput, low-latency performance for large language models and agentic systems that automate work on behalf of humans.
Why it matters
The Groq 3 LPU represents Nvidia's push to expand its data center footprint with specialized hardware optimized for the growing demands of AI inference workloads. As the industry moves toward more complex multiagent systems powered by massive language models, Nvidia is positioning its new chip and server racks to handle the high-throughput, low-latency requirements of these workloads.
The details
The Groq 3 LPU is designed to offer faster memory and higher bandwidth than Nvidia's GPUs, enabling it to accelerate inference processing. It will ship in dedicated Groq 3 LPX server racks, each housing 256 Groq 3 LPUs with 128GB of SSRAM and 40 petabytes per second of bandwidth. The Groq 3 LPX is meant to work in tandem with Nvidia's new Vera Rubin NVL72 rack, which integrates both Rubin GPUs and Vera CPUs. Together, the two systems are designed to handle trillion-parameter models and million-token context windows, delivering 35 times higher throughput per megawatt and 10 times greater revenue opportunity.
- Nvidia first announced the Groq acquisition and hiring of the startup's founders in December 2025.
- The Groq 3 LPU was announced at Nvidia's GTC 2026 developer conference in San Jose, California on March 16, 2026.
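For scale, dividing the rack-level figures gives a rough per-chip number. This is our back-of-the-envelope arithmetic from the reported rack specs, not a per-LPU figure Nvidia has stated:

```python
# Back-of-the-envelope check of the reported Groq 3 LPX rack figures
# (256 LPUs, 40 petabytes per second of bandwidth). The division is
# illustrative only; the article gives no per-LPU spec.
RACK_LPUS = 256
RACK_BANDWIDTH_PBPS = 40  # petabytes per second, per rack

per_lpu_tbps = RACK_BANDWIDTH_PBPS * 1000 / RACK_LPUS  # PB/s -> TB/s
print(f"~{per_lpu_tbps:.2f} TB/s of bandwidth per LPU")  # ~156.25 TB/s
```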
The players
Nvidia
An American multinational technology company that designs graphics processing units (GPUs) for the gaming and professional markets, as well as system on a chip units (SoCs) for the mobile computing and automotive market.
Groq Inc.
A startup that Nvidia acquired in a $20 billion deal in December 2025, known for developing processors focused on AI inference rather than training.
Jonathan Ross
The founder of Groq Inc., who joined Nvidia as part of the acquisition.
Sunny Madra
The president of Groq Inc., who also joined Nvidia as part of the acquisition.
Ian Buck
The vice president of hyperscale and high-performance computing at Nvidia.
What they’re saying
“While the company's GPUs offer greater memory, Groq 3's memory is much faster. It's designed to support low-latency workloads and the large context demands of agentic systems that automate work on behalf of humans.”
— Ian Buck, Vice President of Hyperscale and High-Performance Computing (SiliconANGLE)
“We're moving towards a reality where multi-agent systems are continually communicating with one another, which means they need to be much more responsive. While 100 tokens-per-second might seem reasonable for humans, such speeds would seem glacial for agentic systems.”
— Ian Buck, Vice President of Hyperscale and High-Performance Computing (SiliconANGLE)
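Buck's point about token rates can be made concrete with a toy calculation. The function and numbers below are hypothetical illustrations of how generation latency compounds when agents respond sequentially, not Nvidia benchmarks:

```python
# Illustrative arithmetic only: why per-agent token rates compound in
# multiagent pipelines. All numbers here are hypothetical examples.

def chain_latency_seconds(tokens_per_message: int,
                          tokens_per_second: float,
                          agents_in_chain: int) -> float:
    """End-to-end time for a message to pass through a chain of agents,
    assuming each agent generates its full response before the next starts."""
    return agents_in_chain * (tokens_per_message / tokens_per_second)

# A 5-agent chain, each emitting a 500-token response:
slow = chain_latency_seconds(500, 100, 5)    # at 100 tok/s: 25.0 s end to end
fast = chain_latency_seconds(500, 2000, 5)   # at 2,000 tok/s: 1.25 s end to end
print(f"{slow:.2f} s vs {fast:.2f} s")
```

At a rate that feels responsive to a human reader, a short chain of agents already takes tens of seconds, which is the responsiveness gap Buck describes.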
What’s next
Nvidia plans to make the Groq 3 LPX server racks and Vera Rubin NVL72 racks available to customers in the coming months, as part of its push to expand its data center footprint and capture a larger share of the growing demand for powerful AI computing infrastructure.
The takeaway
Nvidia's new Groq 3 language processing unit and accompanying server racks represent the company's strategy of providing specialized hardware optimized for the demands of large-scale multiagent AI systems. By combining the Groq 3 LPU's high-throughput, low-latency inference capabilities with its Rubin GPUs and Vera CPUs, Nvidia aims to deliver a comprehensive solution for the next generation of AI-powered automation and intelligent systems.