Evo 2 AI Models Genetic Code for All Life Domains

The largest open-source AI model in biology can identify disease-causing mutations and design new genomes.

Published on Mar. 5, 2026

The DNA foundation model Evo 2, trained on more than 100,000 species across the entire tree of life, can accurately identify disease-causing mutations in human genes and design new genomes as long as those of simple bacteria. Developed by scientists from Arc Institute and NVIDIA, Evo 2 is the largest artificial intelligence model in biology to date, trained on over 9.3 trillion nucleotides from more than 128,000 whole genomes. The model's code is publicly accessible, making it the largest-scale fully open-source AI model to date.

Why it matters

Evo 2 represents a key moment in the emerging field of generative biology, enabling machines to read, write, and think in the language of nucleotides. The model's generalist understanding of the tree of life is useful for a multitude of tasks, from predicting disease-causing mutations to designing potential code for artificial life. By helping find the genetic causes of human diseases and accelerating the development of new medicines, it could save countless hours and research dollars otherwise spent on cell or animal experiments.

The details

Evo 2 was trained for several months on the NVIDIA DGX Cloud AI platform via AWS, utilizing over 2,000 NVIDIA H100 GPUs. The model can process genetic sequences of up to 1 million nucleotides at once, enabling it to understand relationships between distant parts of a genome. In tests with variants of the breast cancer-associated gene BRCA1, Evo 2 achieved over 90% accuracy in predicting which mutations are benign versus potentially pathogenic. Researchers have also used Evo 2 to design functional synthetic bacteriophages, demonstrating potential applications for treating antibiotic-resistant bacteria.

  • Evo 2 was first released as a preprint in February 2025.
  • Evo 2 is now published in the journal Nature.
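
To make the BRCA1 accuracy figure above more concrete, here is a minimal Python sketch of how such a number could be computed. The article does not describe how Evo 2 scores variants, so this assumes a model that assigns each variant a single numeric score, with lower scores meaning a variant is more likely harmful. The function names, the threshold, and all toy values are invented for illustration and are not Evo 2 output.

```python
from typing import List


def classify(scores: List[float], threshold: float) -> List[str]:
    """Label a variant 'pathogenic' when its score falls below the threshold.

    Assumption: lower score = more likely harmful. This convention is
    hypothetical and is not taken from the article.
    """
    return ["pathogenic" if s < threshold else "benign" for s in scores]


def accuracy(predicted: List[str], actual: List[str]) -> float:
    """Return the fraction of variants whose predicted label matches the known label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one variant is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Invented toy data: five variants with made-up scores and made-up labels.
toy_scores = [-4.2, -0.1, -3.7, 0.3, -0.4]
toy_labels = ["pathogenic", "benign", "pathogenic", "benign", "pathogenic"]

toy_predictions = classify(toy_scores, threshold=-1.0)
toy_accuracy = accuracy(toy_predictions, toy_labels)
print(f"Toy accuracy: {toy_accuracy:.0%}")
```

In this toy set, the last variant is misclassified, so four of the five labels match. A real evaluation would use experimentally characterized variants and would likely report additional metrics alongside accuracy.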

The players

Arc Institute

A research institute that collaborated with NVIDIA to develop the Evo 2 AI model.

NVIDIA

A technology company that collaborated with Arc Institute to develop the Evo 2 AI model.

Patrick Hsu

Arc Institute Co-Founder, Arc Core Investigator, and Assistant Professor of Bioengineering and Deb Faculty Fellow at the University of California, Berkeley.

Brian Hie

Assistant Professor of Chemical Engineering at Stanford University, the Dieter Schwarz Foundation Stanford Data Science Faculty Fellow, and Arc Institute Innovation Investigator in Residence.

Hani Goodarzi

Arc Core Investigator and Associate Professor of Biochemistry and Biophysics at the University of California, San Francisco.

What they’re saying

“Our development of Evo 1 and Evo 2 represents a key moment in the emerging field of generative biology, as the models have enabled machines to read, write, and think in the language of nucleotides.”

— Patrick Hsu, Arc Institute Co-Founder, Arc Core Investigator, and Assistant Professor of Bioengineering and Deb Faculty Fellow at the University of California, Berkeley (Nature)

“Just as the world has left its imprint on the language of the Internet used to train large language models, evolution has left its imprint on biological sequences. These patterns, refined over millions of years, contain signals about how molecules work and interact.”

— Brian Hie, Assistant Professor of Chemical Engineering at Stanford University, the Dieter Schwarz Foundation Stanford Data Science Faculty Fellow, and Arc Institute Innovation Investigator in Residence (Nature)

“If you have a gene therapy that you want to turn on only in neurons to avoid side effects, or only in liver cells, you could design a genetic element that is only accessible in those specific cells. This precise control could help develop more targeted treatments with fewer side effects.”

— Hani Goodarzi, Arc Core Investigator and Associate Professor of Biochemistry and Biophysics at the University of California, San Francisco (Nature)

What’s next

The research team envisions that more specific AI models could be built with Evo 2 as a foundation, with potential applications ranging from predicting how single DNA mutations affect a protein's function to designing genetic elements that behave differently in different cell types.

The takeaway

Evo 2 represents a significant advancement in the field of generative biology, enabling machines to read, write, and think in the language of DNA and RNA. This powerful AI model has the potential to accelerate scientific research, drive the development of new treatments and therapies, and unlock new frontiers in the understanding of biological systems.