Milpitas Today
By the People, for the People
AIC and ScaleFlux Deliver Optimized Hardware Platform for AI Context Memory Storage
Joint solution addresses growing KV-cache and long-context inference workloads with a purpose-built CMX infrastructure tier.
Mar. 13, 2026 at 7:00pm
As AI models evolve to support longer prompts, multi-turn conversations, and autonomous agents, the amount of memory required to store inference context has expanded dramatically. ScaleFlux and AIC are delivering a joint hardware platform designed to accelerate emerging Inference Context Memory Storage (ICMS or CMX) deployments for large-scale AI inference infrastructure. The platform combines the AIC F2032-G6 JBOF Storage System with ScaleFlux NVMe SSDs and NVIDIA's latest data-center networking technologies, providing a purpose-built hardware solution optimized for the rapidly growing context memory storage tier in modern AI clusters.
Why it matters
The growing need for context memory storage in AI inference workloads is driven by the shift from stateless queries to persistent, long-context interactions. This new hardware platform helps AI infrastructure operators address key challenges like expanding KV-cache requirements, efficient offloading of context memory from GPUs, and providing high-performance shared storage to serve context data at scale.
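To see why KV-cache footprints balloon with context length, a back-of-the-envelope estimate helps. The sketch below is illustrative only; the model dimensions are hypothetical examples, not figures from the announcement:

```python
# Rough KV-cache size estimate for a transformer decoder.
# All model dimensions below are hypothetical examples.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, dtype_bytes=2):
    """Bytes needed to cache keys and values for one sequence.

    The factor of 2 accounts for storing both K and V per layer.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

# Example: a large model with grouped-query attention
# (80 layers, 8 KV heads, head_dim 128, fp16 values).
per_128k_ctx = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128,
                              seq_len=128_000)
print(f"{per_128k_ctx / 2**30:.1f} GiB per 128k-token sequence")  # → 39.1 GiB
```

At roughly 39 GiB of cache per long-context sequence in this hypothetical configuration, even a handful of concurrent sessions exceeds GPU memory, which is the pressure that motivates a dedicated storage tier.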
The details
The AIC F2032-G6 JBOF platform provides a high-density NVMe storage system integrated with BlueField-4 DPUs or ConnectX-9 SuperNICs to deliver high-throughput, low-latency connectivity between GPU servers and shared context memory storage. Populated with ScaleFlux NVMe SSDs, the system offers an efficient hardware configuration for CMX deployments, with ScaleFlux SSD technology designed to sustain the high-IOPS, low-latency access patterns typical of KV-cache workloads.
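The tiering pattern such a platform targets can be pictured as a cache hierarchy: hot context stays in GPU or host memory, while colder entries spill to the NVMe tier and are reloaded on demand. Below is a minimal, hypothetical sketch of that spill-and-reload pattern; all class and method names are invented for illustration, and real inference engines implement far more sophisticated paging:

```python
import os
import pickle
import tempfile
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier cache: hot entries in RAM, cold entries spilled to disk.

    Illustrates the offload pattern only; names and design are hypothetical.
    """

    def __init__(self, hot_capacity, spill_dir):
        self.hot = OrderedDict()       # in-memory tier (stands in for HBM/DRAM)
        self.hot_capacity = hot_capacity
        self.spill_dir = spill_dir     # stands in for the shared NVMe tier

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)
        while len(self.hot) > self.hot_capacity:
            cold_key, cold_val = self.hot.popitem(last=False)  # evict LRU entry
            with open(os.path.join(self.spill_dir, f"{cold_key}.kv"), "wb") as f:
                pickle.dump(cold_val, f)

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        path = os.path.join(self.spill_dir, f"{key}.kv")
        with open(path, "rb") as f:    # reload from the storage tier
            value = pickle.load(f)
        self.put(key, value)           # promote back to the hot tier
        return value

with tempfile.TemporaryDirectory() as d:
    cache = TieredKVCache(hot_capacity=2, spill_dir=d)
    cache.put("seq-a", [1, 2, 3])
    cache.put("seq-b", [4, 5, 6])
    cache.put("seq-c", [7, 8, 9])      # evicts seq-a to disk
    assert "seq-a" not in cache.hot
    assert cache.get("seq-a") == [1, 2, 3]   # transparently reloaded
```

The design choice the sketch highlights is that eviction and reload are transparent to the caller, which is why the storage tier's latency and IOPS, the properties the joint platform is built around, directly bound inference throughput.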
- The joint platform was announced on March 13, 2026.
The players
AIC
A global leader in server and storage solutions, with expertise in high-density storage servers, storage server barebones, and high-performance computers.
ScaleFlux
A fabless semiconductor company that has developed a holistic approach to storage and memory innovation, combining hardware and software to unlock performance, efficiency, security, and scalability for data-intensive applications.
NVIDIA
A technology company that provides advanced data-center networking technologies, including the BlueField-4 DPU and ConnectX-9 SuperNIC, which are integrated into the joint hardware platform.
What they’re saying
“AI inference is rapidly shifting from stateless queries to persistent, long-context interactions. Our new F2032-G6 platform, combined with BlueField-4 and ConnectX-9 networking, provides the high-performance storage architecture needed to support context memory storage at scale.”
— Michael Liang, CEO at AIC
“Context memory is emerging as a new data tier in AI infrastructure. By pairing ScaleFlux NVMe SSDs with AIC's high-density JBOF platform and NVIDIA's advanced data-center networking technologies, we are delivering a hardware solution optimized for the next generation of AI inference pipelines.”
— Hao Zhong, CEO and Co-Founder at ScaleFlux
What’s next
The joint hardware platform is expected to see broad adoption among AI infrastructure operators as demand for scalable context memory storage grows alongside increasingly sophisticated AI services.
The takeaway
This new hardware platform addresses the evolving needs of AI inference workloads, providing a purpose-built solution for the rapidly growing context memory storage tier. By combining high-performance storage, advanced networking, and optimized hardware, the platform helps maximize GPU utilization and support the next generation of long-context AI applications.