BotBeat
PRODUCT LAUNCH | Arc Institute | 2026-03-05

Arc Institute Unveils Evo 2: AI Model Trained on 9.3 Trillion Base Pairs Can Design Novel Genes and Synthetic Organisms

Key Takeaways

  • Evo 2 was trained on 9.3 trillion base pairs from 128,000+ organisms, representing 30x more data than Evo 1 and the largest genomic AI training dataset to date
  • The model can process up to 1 million base pairs at once and has already been used to design functional bacteriophages and synthetic genomes for mitochondria and yeast
  • Evo 2 identified pathogenic BRCA1 mutations with over 90% accuracy and is being applied to Alzheimer's disease risk prediction and livestock genetics
Source: Hacker News, https://www.dongascience.com/en/news/76660

Summary

Researchers at the Arc Institute, in collaboration with NVIDIA, Stanford University, UC Berkeley, and UCSF, have developed Evo 2, an AI foundation model for genetic analysis and design. Published in Nature on March 4, 2025, Evo 2 was trained over several months on 9.3 trillion DNA base pairs drawn from more than 128,000 organisms, including humans, bacteria, plants, and extinct species such as mammoths. That is 30 times more training data than its predecessor, Evo 1, released in 2024, and makes Evo 2 the largest-scale genomic AI model to date.

Evo 2 can process sequences of up to one million base pairs at once, which lets it capture relationships between distant genes. The model has already been applied in practice: it identified breast cancer-causing BRCA1 mutations with over 90% accuracy, and it is being used to predict genetic disease risk in Alzheimer's patients and to evaluate genetic variation across livestock species. Most notably, researchers have used Evo 2 to design and synthesize functional bacteriophages (viruses that target bacteria), with potential applications in treating antibiotic-resistant infections. The team also designed artificial genomes for mitochondria and yeast.
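To give a rough sense of what a one-million-base-pair window means in practice, here is a minimal Python sketch. It is not Evo 2 code and does not call any Evo 2 API. The function names are hypothetical, and the genome sizes in the usage note (about 16,569 bp for human mitochondrial DNA, about 12.1 million bp for baker's yeast) are approximate illustrative values that do not come from the article.

```python
import math

# Context length reported in the article: up to one million base pairs.
CONTEXT_BP = 1_000_000


def windows_needed(genome_bp: int, context_bp: int = CONTEXT_BP) -> int:
    """Return how many non-overlapping windows are needed to cover a genome."""
    if genome_bp <= 0 or context_bp <= 0:
        raise ValueError("lengths must be positive")
    return math.ceil(genome_bp / context_bp)


def fit_in_one_window(pos_a: int, pos_b: int, context_bp: int = CONTEXT_BP) -> bool:
    """Return True if two 0-based positions can fall inside a single window."""
    # A window of length W spans W consecutive positions, so two positions
    # can share it only if they are at most W - 1 apart.
    return abs(pos_a - pos_b) < context_bp


if __name__ == "__main__":
    # Approximate, illustrative genome sizes (assumptions, not from the article).
    print(windows_needed(16_569))       # mitochondrial-scale genome
    print(windows_needed(12_100_000))   # yeast-scale genome
    print(fit_in_one_window(0, 999_999))
```

Under these assumptions, a mitochondrial-scale genome fits inside a single window, while a yeast-scale genome spans several windows. Two sites can be read together in a single pass only if they lie less than one million positions apart.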

While Evo 2 represents a significant advance, experts caution that creating fully functional artificial organisms remains beyond current capabilities. As Wageningen University professor Nico Claassens noted, even a single missing or incorrectly modeled essential gene could prevent a synthetic genome from functioning. Nevertheless, the model's ability to identify genetic patterns across species could substantially reduce the time and cost of traditional experimental methods, potentially accelerating drug development and disease research. As a foundation model, Evo 2 is expected to serve as a platform for more specialized AI applications, and researchers anticipate creative uses well beyond those envisioned today.

  • While promising for accelerating drug development and disease research, experts say fully functional artificial organisms remain beyond current capabilities
  • As a foundation model, Evo 2 is designed to enable researchers to build specialized AI applications for various genomic and biological challenges

Editorial Opinion

Evo 2 represents a quantum leap in computational biology, demonstrating that AI trained on evolutionary patterns can unlock insights that would take decades to reach through traditional methods. The model's ability to design functional synthetic organisms, even at the viral and organellar level, suggests we're approaching a new era where biology becomes programmable. However, the gap between designing 70% of a genome and creating fully viable artificial life underscores an important reality: understanding life's complexity requires more than pattern recognition. It demands comprehension of intricate interdependencies that evolution refined over billions of years. The real promise lies not in replacing nature's designs, but in using AI to decode them faster, potentially revolutionizing how we treat diseases and develop therapeutics.
