Evo 2 AI Models Genetic Code for All Life Domains
The largest open-source AI model in biology can identify disease-causing mutations and design new genomes.
Published on Mar. 5, 2026
The DNA foundation model Evo 2, trained on over 100,000 species across the entire tree of life, can accurately identify disease-causing mutations in human genes and design new genomes as long as those of simple bacteria. Developed by scientists from Arc Institute and NVIDIA, Evo 2 is the largest artificial intelligence model in biology to date, built from over 9.3 trillion nucleotides drawn from over 128,000 whole genomes. The model's code is publicly accessible, making it the largest-scale fully open-source AI model to date.
Why it matters
Evo 2 represents a key moment in the emerging field of generative biology, enabling machines to read, write, and think in the language of nucleotides. The model's generalist understanding of the tree of life is useful for many tasks, from predicting disease-causing mutations to designing potential code for artificial life. By helping find the genetic causes of human diseases and accelerating the development of new medicines, it could save much of the time and research money now spent on cell or animal experiments.
The details
Evo 2 was trained for several months on the NVIDIA DGX Cloud AI platform via AWS, using over 2,000 NVIDIA H100 GPUs. The model can process genetic sequences of up to 1 million nucleotides at once, enabling it to understand relationships between distant parts of a genome. In tests with variants of the breast cancer-associated gene BRCA1, Evo 2 achieved over 90% accuracy in predicting which mutations are benign and which are potentially pathogenic. Researchers have also used Evo 2 to design functional synthetic bacteriophages, demonstrating potential applications for treating antibiotic-resistant bacteria.
- Evo 2 was first released as a preprint in February 2025.
- Evo 2 is now published in the journal Nature.
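The article does not say how Evo 2 separates benign from potentially pathogenic BRCA1 variants. One common approach for sequence models, assumed here purely for illustration, is to compare the likelihood a model assigns to a mutated sequence with the likelihood it assigns to the reference sequence: a large drop suggests the mutation is unusual. The Python sketch below shows that idea with a toy Markov model standing in for the real one. The function names (`train_kmer_model`, `log_likelihood`, `variant_score`) and the training sequences are hypothetical and are not part of Evo 2.

```python
import math
from collections import Counter


def train_kmer_model(sequences, k=3):
    """Count which base follows each (k-1)-base context in the training data.

    This is a toy stand-in for a DNA language model, not Evo 2.
    """
    counts = {}
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            context, next_base = seq[i:i + k - 1], seq[i + k - 1]
            counts.setdefault(context, Counter())[next_base] += 1
    return counts


def log_likelihood(seq, counts, k=3, alpha=1.0):
    """Sum of log-probabilities of each base given its preceding context.

    Add-alpha smoothing over the 4 DNA bases keeps unseen events finite.
    """
    total = 0.0
    for i in range(len(seq) - k + 1):
        context, next_base = seq[i:i + k - 1], seq[i + k - 1]
        seen = counts.get(context, Counter())
        prob = (seen[next_base] + alpha) / (sum(seen.values()) + 4 * alpha)
        total += math.log(prob)
    return total


def variant_score(reference, position, alt_base, counts, k=3):
    """Log-likelihood of the mutated sequence minus that of the reference.

    Negative scores mean the model finds the variant less plausible.
    """
    mutated = reference[:position] + alt_base + reference[position + 1:]
    return log_likelihood(mutated, counts, k) - log_likelihood(reference, counts, k)


if __name__ == "__main__":
    # Hypothetical training data: a simple repeating pattern.
    model = train_kmer_model(["ACGT" * 20])
    reference = "ACGTACGTACGT"
    # Substituting C -> T at index 5 breaks the learned pattern.
    print(variant_score(reference, 5, "T", model))
```

In a setup like this, a threshold on the score would turn it into a benign-versus-pathogenic call, and accuracy would be measured against variants with known clinical labels. How Evo 2 itself is scored and evaluated is described in the published paper, not here.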
The players
Arc Institute
A research institute that collaborated with NVIDIA to develop the Evo 2 AI model.
NVIDIA
A technology company that collaborated with Arc Institute to develop the Evo 2 AI model.
Patrick Hsu
Arc Institute Co-Founder, Arc Core Investigator, and Assistant Professor of Bioengineering and Deb Faculty Fellow at the University of California, Berkeley.
Brian Hie
Assistant Professor of Chemical Engineering at Stanford University, the Dieter Schwarz Foundation Stanford Data Science Faculty Fellow, and Arc Institute Innovation Investigator in Residence.
Hani Goodarzi
Arc Core Investigator and Associate Professor of Biochemistry and Biophysics at the University of California, San Francisco.
What they’re saying
“Our development of Evo 1 and Evo 2 represents a key moment in the emerging field of generative biology, as the models have enabled machines to read, write, and think in the language of nucleotides.”
— Patrick Hsu, Arc Institute Co-Founder, Arc Core Investigator, and Assistant Professor of Bioengineering and Deb Faculty Fellow at the University of California, Berkeley (Nature)
“Just as the world has left its imprint on the language of the Internet used to train large language models, evolution has left its imprint on biological sequences. These patterns, refined over millions of years, contain signals about how molecules work and interact.”
— Brian Hie, Assistant Professor of Chemical Engineering at Stanford University, the Dieter Schwarz Foundation Stanford Data Science Faculty Fellow, and Arc Institute Innovation Investigator in Residence (Nature)
“If you have a gene therapy that you want to turn on only in neurons to avoid side effects, or only in liver cells, you could design a genetic element that is only accessible in those specific cells. This precise control could help develop more targeted treatments with fewer side effects.”
— Hani Goodarzi, Arc Core Investigator and Associate Professor of Biochemistry and Biophysics at the University of California, San Francisco (Nature)
What’s next
The research team envisions that more specific AI models could be built with Evo 2 as a foundation, with potential applications ranging from predicting how single DNA mutations affect a protein's function to designing genetic elements that behave differently in different cell types.
The takeaway
Evo 2 represents a significant advancement in the field of generative biology, enabling machines to read, write, and think in the language of DNA and RNA. This powerful AI model has the potential to accelerate scientific research, drive the development of new treatments and therapies, and unlock new frontiers in the understanding of biological systems.


