AWS and Cerebras Collaboration Aims to Set a New Standard for AI Inference Speed and Performance in the Cloud
Deployed in AWS data centers and accessed through Amazon Bedrock, the AWS Trainium + Cerebras CS-3 solution will accelerate AI inference
Mar. 13, 2026 at 7:05pm
Amazon Web Services (AWS) and Cerebras Systems have announced a collaboration that will deliver the fastest AI inference solutions available for generative AI applications and large language model (LLM) workloads. The solution, to be deployed in AWS data centers and accessed through Amazon Bedrock, combines AWS Trainium-powered servers, Cerebras CS-3 systems, and Elastic Fabric Adapter (EFA) networking. The integrated system is designed to deliver unmatched performance and speed by splitting the inference workload across Trainium and CS-3, with each system optimized for the stage of the computation it handles.
Why it matters
Inference speed is a critical bottleneck for demanding AI workloads such as real-time coding assistance and interactive applications. AWS and Cerebras aim to remove that bottleneck by delivering the fastest AI inference available, letting enterprises around the world run blisteringly fast inference within their existing AWS environment.
The details
The Trainium + CS-3 solution enables 'inference disaggregation,' a technique that separates AI inference into two stages: prompt processing, or 'prefill,' and output generation, or 'decode.' Trainium is optimized for the prefill stage, which is highly parallel and compute-intensive, while the Cerebras CS-3 is optimized for the decode stage, which is inherently serial and memory-bandwidth-intensive. By disaggregating inference this way, each stage runs on hardware specialized for its computational profile, resulting in significantly faster inference overall (a simplified sketch of this hand-off appears below).
- The new solution will be launched in the coming months and made available through Amazon Bedrock.
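Neither company has published an API for the joint solution, so the following Python sketch is purely illustrative: the PrefillWorker, DecodeWorker, and KVCache names and the placeholder token logic are assumptions, not AWS or Cerebras interfaces. It shows only the control flow of disaggregated inference described above: a parallel prefill stage builds the attention KV cache, which is handed off (over EFA networking in the announced design; here just an in-process call) to a serial decode stage that generates tokens one at a time.

```python
# Illustrative sketch of disaggregated inference control flow.
# All class and function names here are hypothetical placeholders,
# not part of any AWS or Cerebras API.

from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Attention key/value state produced by prefill and read by decode."""
    tokens: list[str] = field(default_factory=list)


class PrefillWorker:
    """Stands in for the prefill pool (Trainium in the announced design):
    processes the whole prompt in parallel and emits the KV cache."""

    def run(self, prompt: str) -> KVCache:
        # Placeholder: a real system would run the model over the full prompt.
        return KVCache(tokens=prompt.split())


class DecodeWorker:
    """Stands in for the decode pool (Cerebras CS-3 in the announced design):
    generates output tokens one at a time, reading the KV cache each step."""

    def run(self, cache: KVCache, max_new_tokens: int) -> list[str]:
        output = []
        for step in range(max_new_tokens):
            # Placeholder token generation; a real decoder samples from the model.
            output.append(f"<token_{step}_given_{len(cache.tokens)}_cached>")
            cache.tokens.append(output[-1])  # decode extends the cache serially
        return output


def disaggregated_generate(prompt: str, max_new_tokens: int = 4) -> str:
    cache = PrefillWorker().run(prompt)                  # parallel, compute-bound stage
    tokens = DecodeWorker().run(cache, max_new_tokens)   # serial, bandwidth-bound stage
    return " ".join(tokens)


if __name__ == "__main__":
    print(disaggregated_generate("Explain inference disaggregation in one sentence."))
```

The design choice the article describes is that each worker pool only ever runs the stage it is best suited for, with the hand-off of state between pools carried by EFA networking.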
The players
Amazon Web Services (AWS)
An Amazon.com, Inc. company that provides on-demand cloud computing platforms and APIs to individuals, companies, and governments, on a metered pay-as-you-go basis.
Cerebras Systems
A company that builds the fastest AI infrastructure in the world, including the Wafer Scale Engine 3 (WSE-3), the world's largest and fastest AI processor.
David Brown
Vice President, Compute & ML Services at AWS.
Andrew Feldman
Founder and CEO of Cerebras Systems.
What they’re saying
“Inference is where AI delivers real value to customers, but speed remains a critical bottleneck for demanding workloads like real-time coding assistance and interactive applications. What we're building with Cerebras solves that: by splitting the inference workload across Trainium and CS-3, and connecting them with Amazon's Elastic Fabric Adapter, each system does what it's best at. The result will be inference that's an order of magnitude faster and higher performance than what's available today.”
— David Brown, Vice President, Compute & ML Services, AWS
“Partnering with AWS to build a disaggregated inference solution will bring the fastest inference to a global customer base. Every enterprise around the world will be able to benefit from blisteringly fast inference within their existing AWS environment.”
— Andrew Feldman, Founder and CEO of Cerebras Systems
What’s next
Later this year, AWS will also offer leading open-source LLMs and Amazon Nova using Cerebras hardware.
The takeaway
The collaboration between AWS and Cerebras aims to set a new standard for AI inference speed and performance in the cloud, addressing a critical bottleneck for demanding AI workloads and enabling enterprises worldwide to benefit from blazingly fast inference within their existing AWS environment.

