Basecamp Research Launches Trillion Gene Atlas to Scale AI-Designed Therapeutics

The Atlas will expand known evolutionary genetic diversity by 100x, collecting novel genomic data from over 100 million new species across thousands of sites globally.

Mar. 18, 2026 at 10:13am

Basecamp Research, a frontier AI lab for biological design, has announced the launch of the Trillion Gene Atlas, a scientific initiative to generate and model biological data at the trillion-gene scale. Launched in collaboration with Anthropic, Ultima Genomics and PacBio, and powered by NVIDIA AI infrastructure, the Trillion Gene Atlas aims to expand known evolutionary genetic diversity 100-fold by collecting genomic data from more than 100 million species across thousands of sites worldwide.

Why it matters

The initiative, which is on the scale of the Human Genome Project, aims to provide the vast, diverse training data required for AI systems to learn from evolution and design new medicines on demand. Current biological AI models are limited by the narrow slice of life on Earth represented in public databases, so expanding the known genetic universe is critical for progress in AI drug development.

The details

Basecamp Research's EDEN foundation models, released in January, have already shown the potential of learning from a much larger proprietary genomic database. Trained on 10 billion new-to-science genes from 1 million newly discovered species, EDEN showed that performance in biological AI scales more steeply with higher-quality, fully contextualized data, and it moved beyond simple prediction to directly designing diverse therapeutics. The Trillion Gene Atlas builds on this approach by greatly expanding the breadth and contextual depth of genomic data suitable for AI training.

  • The Trillion Gene Atlas initiative was announced on March 18, 2026 at SXSW in Austin and the NVIDIA GTC conference in San Jose.
  • Basecamp Research plans to compress over two decades of biological data gathering and analysis into less than two years through the Trillion Gene Atlas project.

The players

Basecamp Research

A frontier AI lab for biological design that is leading the Trillion Gene Atlas initiative.

Anthropic

A partner in the Trillion Gene Atlas initiative, working to add new capabilities for life sciences and connect its Claude AI platform to more scientific platforms.

Ultima Genomics

A developer of ultra-high throughput next-generation sequencing (NGS) systems that is providing industrial-scale sequencing for the Trillion Gene Atlas.

PacBio

A provider of highly accurate long-read sequencing technology that lets the Trillion Gene Atlas preserve full genomic context and resolve complex samples at the subspecies level.

NVIDIA

The provider of accelerated computing infrastructure that will power the processing of vast quantities of genetic data for the Trillion Gene Atlas.

What they’re saying

“Today's biological AI models are trained on a narrow slice of life on Earth. The Trillion Gene Atlas expands the known genetic universe by orders of magnitude beyond what is in public databases. Training models at this scale establishes a new paradigm for programmable therapeutic design.”

— Glen Gowers, Co-founder and CEO of Basecamp Research (SXSW)

“Bigger models alone aren't enough. EDEN showed that performance in biological AI follows much steeper scaling trajectories with higher quality and fully contextualized data. The Trillion Gene Atlas extends that principle 100-fold.”

— Phil Lorenz, CTO of Basecamp Research (SXSW)

“Biology has been fundamentally data-starved when compared to other fields like language or computer vision as researchers have lacked the tools required to generate data at scale. We strongly believe that AI will have an immense impact on our understanding of biology and human health, and the UG200 Series was designed from the ground up to enable the massive datasets required for BioAI to deliver on this promise.”

— Gilad Almogy, Founder and CEO of Ultima Genomics (SXSW)

“PacBio HiFi sequencing delivers highly accurate long reads that preserve full genomic context and enables subspecies and even strain-level resolution in complex samples. HiFi data provides the reliable, information-rich foundation biological AI models need to learn from nature at scale and power initiatives like the Trillion Gene Atlas.”

— Christian Henry, President and CEO of PacBio (SXSW)

What’s next

Through parallelized data processing, automated annotation, and large-scale model training, the partners expect to compress a task that previously would have required more than 20 years of processing time to less than two years. This compression of sequencing, assembly, annotation and model training is intended to expand the performance and scope of biological foundation models across therapeutic development.

The takeaway

The Trillion Gene Atlas is an attempt to expand, by a targeted 100-fold, the genetic diversity available to train AI systems for therapeutic design. Working with Anthropic, Ultima Genomics and PacBio on NVIDIA infrastructure, Basecamp Research aims to create a new paradigm for programmable drug discovery that learns directly from the diversity of life on Earth.