BotBeat
Arc Institute
RESEARCH · 2026-03-04

Evo 2: Foundation Model Trained on 9 Trillion DNA Base Pairs Advances Genome Design Across All Life

Key Takeaways

  • Evo 2 was trained on 9 trillion DNA base pairs with a 1 million token context window, making it the largest genomic foundation model to date
  • The model accurately predicts the functional impact of genetic variants, including pathogenic noncoding mutations and BRCA1 variants, without task-specific fine-tuning
  • Evo 2 generates biologically coherent sequences at genome scale across mitochondrial, prokaryotic, and eukaryotic genomes
Source: Hacker News · https://www.nature.com/articles/s41586-026-10176-5

Summary

Researchers have unveiled Evo 2, a biological foundation model trained on an unprecedented 9 trillion DNA base pairs from a curated genomic atlas spanning all domains of life. Published in Nature, the model features a 1 million token context window with single-nucleotide resolution, enabling it to predict functional impacts of genetic variations—from noncoding pathogenic mutations to clinically significant BRCA1 variants—without requiring task-specific fine-tuning. The system demonstrates sophisticated understanding of biological features including exon-intron boundaries, transcription factor binding sites, protein structural elements, and prophage genomic regions.
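Zero-shot variant effect prediction of this kind is typically done by comparing the model's likelihood of the reference sequence against the same sequence carrying the variant. The sketch below illustrates the idea with a toy transition table standing in for the model; the table, function names, and numbers are illustrative assumptions, not Evo 2's actual API or scores.

```python
import math

# Toy 4x4 transition table standing in for a genomic language model's
# conditional probabilities (a real model like Evo 2 would supply these;
# the values here are illustrative only).
FAVORED = {("A", "C"), ("C", "G"), ("G", "T"), ("T", "A")}
LOGP = {(a, b): math.log(0.4 if (a, b) in FAVORED else 0.2)
        for a in "ACGT" for b in "ACGT"}

def seq_log_likelihood(seq, logp):
    """Sum conditional log-probabilities along the sequence, the way an
    autoregressive model scores a genomic window."""
    return sum(logp[(prev, base)] for prev, base in zip(seq, seq[1:]))

def variant_delta(ref_seq, pos, alt_base, logp):
    """Zero-shot variant effect score: logL(alt) - logL(ref).
    Strongly negative deltas mark substitutions the model finds unlikely,
    a proxy for functional disruption."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return seq_log_likelihood(alt_seq, logp) - seq_log_likelihood(ref_seq, logp)

ref = "ACGTACGTAC"
print(round(variant_delta(ref, 4, "G", LOGP), 3))  # -1.386: disfavored substitution
```

No fine-tuning is involved: the score falls out of the pretrained likelihood alone, which is what "without task-specific fine-tuning" means in practice.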

Evo 2's generative capabilities represent a significant advance in computational biology, producing mitochondrial, prokaryotic, and eukaryotic sequences at genome scale with superior naturalness and coherence compared to previous methods. The model can generate experimentally validated chromatin accessibility patterns when combined with predictive models and inference-time search techniques. Mechanistic interpretability analyses reveal that Evo 2 has learned biologically meaningful representations across multiple scales of genomic organization.
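The "generator plus predictive model plus inference-time search" loop described above can be sketched in its simplest best-of-n form: draw candidate sequences from the generator, score each with an external predictor, and keep the top scorer. Everything here is a hypothetical stand-in; a GC-fraction scorer substitutes for a real chromatin-accessibility model, and random sequences substitute for Evo 2's samples.

```python
import random

def sample_candidates(n, length, rng):
    """Toy stand-in for drawing sequences from a generative model."""
    return ["".join(rng.choice("ACGT") for _ in range(length)) for _ in range(n)]

def gc_fraction(seq):
    """Toy predictive scorer standing in for a chromatin-accessibility model."""
    return sum(base in "GC" for base in seq) / len(seq)

def best_of_n(n, length, score, seed=0):
    """Simplest inference-time search: sample n candidates and keep the
    one the external predictor scores highest."""
    rng = random.Random(seed)  # fixed seed keeps the sketch deterministic
    return max(sample_candidates(n, length, rng), key=score)

picked = best_of_n(64, 40, gc_fraction)
baseline = best_of_n(1, 40, gc_fraction)
print(gc_fraction(picked) >= gc_fraction(baseline))  # True: search never hurts
```

Real pipelines use stronger search (beam search, guided sampling) over a learned predictor, but the division of labor is the same: the generator proposes, the predictor steers.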

In a notable commitment to open science, the research team has made Evo 2 fully open-source, including model parameters, training code, inference code, and the OpenGenome2 dataset. This comprehensive release aims to accelerate exploration and design of biological complexity across the scientific community. The model's ability to work across all domains of life without fine-tuning positions it as a potential universal tool for genomic research and synthetic biology applications.


Editorial Opinion

Evo 2 represents a watershed moment in computational biology, demonstrating that foundation models can capture the fundamental principles of genomic organization across all life. The 1 million token context window is particularly impressive, enabling the model to reason about long-range genomic interactions that previous models couldn't address. The decision to make everything fully open-source—including the massive OpenGenome2 dataset—sets a powerful precedent for responsible AI development in biology and could dramatically accelerate both basic research and therapeutic applications.

Large Language Models (LLMs) · Generative AI · Machine Learning · Healthcare · Science & Research · Open Source

More from Arc Institute

Arc Institute
RESEARCH

Stanford Researchers Reverse Age-Related Memory Loss by Targeting Gut-Brain Communication

2026-03-12
Arc Institute
PRODUCT LAUNCH

Evo 2: Open-Source AI Trained on Trillions of DNA Bases Can Decode Complex Genomes

2026-03-05
Arc Institute
RESEARCH

AI Models Can Now Generate Entire Genome Sequences, But Synthetic Life Remains Distant

2026-03-05


Suggested

Anthropic
RESEARCH

Inside Claude Code's Dynamic System Prompt Architecture: Anthropic's Complex Context Engineering Revealed

2026-04-05
GitHub
PRODUCT LAUNCH

GitHub Launches Squad: Open Source Multi-Agent AI Framework to Simplify Complex Workflows

2026-04-05
SourceHut
INDUSTRY REPORT

SourceHut's Git Service Disrupted by LLM Crawler Botnets

2026-04-05
© 2026 BotBeat