EverMind Unveils MSA Architecture for 100M-Token Long-Term Memory in LLMs

Novel memory architecture achieves efficient, end-to-end long-term memory scaling to unprecedented 100 million tokens.

Mar. 19, 2026 at 5:33am

EverMind, an AI memory infrastructure pioneer, has released a landmark research paper introducing the Memory Sparse Attention (MSA) architecture. MSA enables large language models to achieve efficient, end-to-end long-term memory at the scale of 100 million tokens, a significant breakthrough in addressing the long-standing challenge of lifelong memory retention in LLMs.

Why it matters

This work is a potential milestone toward a new era of "Memory-as-a-Service" for the AI ecosystem, in which memory acts as an independent, pluggable service that can be freely combined with various reasoning cores. It sketches a future for AI in which user data and "memory assets" are no longer locked into any single model or vendor.

The details

The MSA architecture combines several key innovations, including the Memory Sparse Attention mechanism, Document-wise RoPE for extreme context extrapolation, KV Cache Compression with Memory Parallelism, and a Memory Interleave mechanism supporting complex reasoning. This cohesive stack of architectural advancements enables MSA to achieve linear complexity while maintaining high precision, shattering the "Impossible Triangle" of trade-offs faced by previous approaches to long-term memory in LLMs.
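The paper's exact mechanisms are not detailed here, but the general idea behind sparse attention over a large external memory, attending only to a small retrieved subset of stored blocks rather than the full history, can be sketched roughly as follows. The block-summary retrieval step, the `top_k` parameter, and all function names are illustrative assumptions for this sketch, not EverMind's actual implementation.

```python
import numpy as np

def memory_sparse_attention(q, memory_keys, memory_values, top_k=2):
    """Illustrative sketch of sparse attention over blocked memory.

    Instead of attending over every stored token, the query first scores
    a coarse summary of each memory block, keeps only the top_k blocks,
    and runs ordinary softmax attention on those. Cost then grows with
    top_k * block_size, not with total memory length.
    """
    # memory_keys / memory_values: lists of (block_len, d) arrays, one per block
    summaries = np.stack([blk.mean(axis=0) for blk in memory_keys])  # (n_blocks, d)
    scores = summaries @ q                                           # relevance per block
    keep = np.argsort(scores)[-top_k:]                               # indices of top-k blocks
    K = np.concatenate([memory_keys[i] for i in keep])               # (top_k*block_len, d)
    V = np.concatenate([memory_values[i] for i in keep])
    logits = K @ q / np.sqrt(q.shape[0])                             # scaled dot-product scores
    w = np.exp(logits - logits.max())                                # stable softmax
    w /= w.sum()
    return w @ V                                                     # weighted memory readout
```

Because only `top_k` blocks enter the softmax regardless of how many blocks the memory holds, per-query cost stays roughly constant as the memory grows, which is the kind of property a linear-complexity long-term memory design depends on.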

  • On March 18, 2026, EverMind released its research paper on MSA.
  • The MSA paper is published on Zenodo and open-sourced on GitHub.

The players

EverMind

A pioneer in AI memory infrastructure, the team behind the MSA architecture and part of Shanda Group's "Discoverative AI" vision.

Shanda Group

The parent company of EverMind, focused on building a "Discoverative AI" ecosystem with a long-term strategy centered on independent, autonomous, and controllable AI infrastructure.

Tianqiao Chen

The founder of Shanda Group, who has deeply incubated the EverMind team and articulated Shanda's "Discoverative AI" vision in recent interviews.


The takeaway

By delivering an efficient, scalable, and high-precision memory framework, MSA addresses the long-standing challenge of lifelong memory retention in LLMs and paves the way for a "Memory-as-a-Service" era in the AI ecosystem, letting LLMs move beyond the limitations of traditional approaches and unlock new possibilities for complex reasoning, knowledge discovery, and self-evolving AI systems.