Anthropic Donates Petri Alignment Tool to Meridian Labs, Releases Major v3.0 Update

Key Takeaways

▸Petri donated to Meridian Labs for independent, credible development separate from Anthropic
▸Petri v3.0 delivers major improvements in adaptability, realism, and depth of alignment testing
▸Tool has demonstrated real-world adoption and impact, including use by UK AI Security Institute

Source:

X (Twitter)https://www.anthropic.com/research/donating-open-source-petri↗

Summary

Anthropic announced it is donating Petri, its open-source alignment testing tool, to Meridian Labs, an AI evaluation non-profit. The move mirrors Anthropic's previous donation of the Model Context Protocol (MCP) to the Linux Foundation and ensures that Petri remains independent from any single AI lab, enhancing its credibility across the industry.

Simultaneously, Anthropic released Petri version 3.0, a major update developed in collaboration with Meridian Labs. The new version introduces three significant improvements: enhanced adaptability through modular architecture (separating auditor and target models), improved realism via a new "Dish" add-on that uses real system prompts and deployment scaffolds, and greater depth through integration with Anthropic's Bloom alignment tool for more in-depth behavioral assessments.

Petri, which launched in October 2025 as part of the Anthropic Fellows program, has become a key component of Anthropic's alignment evaluation framework, used to assess all Claude models since Claude Sonnet 4.5. The tool enables rapid testing for concerning tendencies like deception, sycophancy, and susceptibility to harmful requests. It has already gained traction with external organizations, including the UK's AI Security Institute (AISI), which integrated Petri into their model evaluation processes.

Reflects Anthropic's commitment to open-source AI safety infrastructure alongside production models

Editorial Opinion

Anthropic's decision to donate Petri to Meridian Labs is a strategic move that prioritizes the credibility of alignment evaluation over institutional control. By releasing v3.0 with substantial improvements before the transition, Anthropic demonstrates genuine commitment to the tool's long-term success rather than using the donation as a way to simply offload maintenance. This approach strengthens the case for why major AI labs should invest in open-source alignment infrastructure—if done authentically, it elevates the entire field's ability to evaluate AI behavior rigorously and independently.

Anthropic Donates Petri Alignment Tool to Meridian Labs, Releases Major v3.0 Update

Key Takeaways

▸Petri donated to Meridian Labs for independent, credible development separate from Anthropic
▸Petri v3.0 delivers major improvements in adaptability, realism, and depth of alignment testing
▸Tool has demonstrated real-world adoption and impact, including use by UK AI Security Institute

Summary

Reflects Anthropic's commitment to open-source AI safety infrastructure alongside production models

Editorial Opinion

Anthropic's decision to donate Petri to Meridian Labs is a strategic move that prioritizes the credibility of alignment evaluation over institutional control. By releasing v3.0 with substantial improvements before the transition, Anthropic demonstrates genuine commitment to the tool's long-term success rather than using the donation as a way to simply offload maintenance. This approach strengthens the case for why major AI labs should invest in open-source alignment infrastructure—if done authentically, it elevates the entire field's ability to evaluate AI behavior rigorously and independently.

Anthropic Donates Petri Alignment Tool to Meridian Labs, Releases Major v3.0 Update

Key Takeaways

Summary

Editorial Opinion

More from Anthropic

Anthropic Releases Prempti: Open-Source Guardrails for AI Coding Agents

Anthropic Unleashes Computer Use: Claude 3.5 Sonnet Now Controls Your Desktop

SpaceX Backs Anthropic with Massive Data Centre Deal Amidst Musk's OpenAI Legal Battle

Comments

Suggested

Anthropic Releases Prempti: Open-Source Guardrails for AI Coding Agents

mm-ctx: Open-Source Multimodal CLI Toolkit Brings Vision Capabilities to AI Agents

Anthropic Unleashes Computer Use: Claude 3.5 Sonnet Now Controls Your Desktop

Anthropic Donates Petri Alignment Tool to Meridian Labs, Releases Major v3.0 Update

Key Takeaways

Summary

Editorial Opinion

More from Anthropic

Anthropic Releases Prempti: Open-Source Guardrails for AI Coding Agents

Anthropic Unleashes Computer Use: Claude 3.5 Sonnet Now Controls Your Desktop

SpaceX Backs Anthropic with Massive Data Centre Deal Amidst Musk's OpenAI Legal Battle

Comments

Suggested

Anthropic Releases Prempti: Open-Source Guardrails for AI Coding Agents

mm-ctx: Open-Source Multimodal CLI Toolkit Brings Vision Capabilities to AI Agents

Anthropic Unleashes Computer Use: Claude 3.5 Sonnet Now Controls Your Desktop