BotBeat
RESEARCH | Multiple AI Companies | 2026-05-08

Phishing Arena: Multi-Agent Security Benchmark Reveals Contextual Plausibility as Primary Phishing Threat Vector

Key Takeaways

  • Contextual plausibility, not technical evasion, accounts for 79% of successful phishing bypasses, which points to a basic weakness in how LLMs handle socially engineered email
  • OpenAI's GPT-5.4-mini is the most capable phisher (12.9% bypass rate, with adaptive learning), while Anthropic's Claude-Sonnet is the strongest email filter (98.3% accuracy, 0.7% false positive rate)
  • A CampaignMemory feedback mechanism lets phishing agents learn from earlier rounds and refine their strategies over 20-round matches, much as real attack campaigns are tuned over time
Source: Hacker News, https://github.com/Krabby24/phishing-arena

Summary

Phishing Arena, an open-source research project by Marco Stocco, is a competitive tournament that benchmarks four commercial LLMs in adversarial email security roles: Claude-Sonnet-4.6 (Anthropic), GPT-5.4-mini (OpenAI), DeepSeek-Chat (DeepSeek), and Grok-4-fast-non-reasoning (xAI). The controlled study runs 48 matches across 24 role permutations, with 20 rounds per match, and tests each model as a Phisher agent, an email Filter, and a Target user in Italian professional email contexts.
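The match counts are easier to check when the role assignments are enumerated. The sketch below assumes that each of the three roles is filled by a different model in every match, which is one reading consistent with the reported 24 permutations; the names `ROLES` and `role_assignments` are illustrative and not taken from the project.

```python
from itertools import permutations

# Models named in the article.
MODELS = [
    "Claude-Sonnet-4.6",
    "GPT-5.4-mini",
    "DeepSeek-Chat",
    "Grok-4-fast-non-reasoning",
]

# Role labels are assumptions based on the article's wording.
ROLES = ("phisher", "filter", "target")


def role_assignments(models=MODELS, roles=ROLES):
    """Return every ordered assignment of distinct models to the roles."""
    return [dict(zip(roles, combo)) for combo in permutations(models, len(roles))]


assignments = role_assignments()

MATCHES = 48            # reported in the article
ROUNDS_PER_MATCH = 20   # reported in the article

# Derived figures: simple arithmetic on the reported numbers.
matches_per_assignment = MATCHES // len(assignments)
total_rounds = MATCHES * ROUNDS_PER_MATCH

print(len(assignments), matches_per_assignment, total_rounds)  # 24 2 960
```

Under this assumption, 4 × 3 × 2 = 24 assignments, so 48 matches would correspond to two matches per assignment and 960 rounds in total. The two-per-assignment split is an inference from the numbers, not something the article states.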

The results show stark differences between models. OpenAI's GPT-5.4-mini leads on phishing success, with a 12.9% bypass rate and a +14.6pp adaptive improvement trend, while Anthropic's Claude-Sonnet dominates filtering, with 98.3% accuracy and a 0.7% false positive rate. The key security finding is that 79% of successful phishing bypasses exploit contextual plausibility rather than technical obfuscation, which indicates that current LLMs are vulnerable less to technical sophistication than to socially engineered, contextually convincing messages. The Phisher agent uses a CampaignMemory feedback loop that accumulates round outcomes, so it can adapt its approach in a way that mimics real-world campaign optimization.
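The article does not describe how CampaignMemory is implemented, so the following is only a minimal sketch of the general idea: a structure that accumulates per-round outcomes and summarizes them for the next round. The class name `CampaignMemorySketch`, the `strategy` label, and every method here are hypothetical, and the bookkeeping is deliberately abstract.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class RoundOutcome:
    round_no: int
    strategy: str          # hypothetical label for the approach used that round
    bypassed_filter: bool  # whether the message got past the filter


@dataclass
class CampaignMemorySketch:
    """Hypothetical stand-in for a feedback memory; not the project's code."""

    outcomes: List[RoundOutcome] = field(default_factory=list)

    def record(self, outcome: RoundOutcome) -> None:
        """Append one round's result to the history."""
        self.outcomes.append(outcome)

    def bypass_rate(self, strategy: Optional[str] = None) -> float:
        """Fraction of recorded rounds that bypassed the filter.

        If `strategy` is given, only rounds using that strategy count.
        """
        relevant = [
            o for o in self.outcomes
            if strategy is None or o.strategy == strategy
        ]
        if not relevant:
            return 0.0
        return sum(o.bypassed_filter for o in relevant) / len(relevant)

    def best_strategy(self) -> Optional[str]:
        """Strategy with the highest bypass rate so far (first seen wins ties)."""
        seen = list(dict.fromkeys(o.strategy for o in self.outcomes))
        if not seen:
            return None
        return max(seen, key=self.bypass_rate)


# Usage with made-up outcomes (illustrative only, not benchmark data).
memory = CampaignMemorySketch()
memory.record(RoundOutcome(1, "strategy_a", False))
memory.record(RoundOutcome(2, "strategy_a", True))
memory.record(RoundOutcome(3, "strategy_b", False))
print(memory.best_strategy(), memory.bypass_rate("strategy_a"))
```

In a loop of this kind, the summary from earlier rounds would be fed back to the agent before the next round, which is what allows a bypass rate to trend upward over a 20-round match.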

The benchmark covers 12 Italian professional archetypes, ranging from CEOs to IT professionals, with varying levels of cybersecurity awareness, and uses a controlled dataset of 600 contextually appropriate legitimate emails. The project provides fully reproducible results, analysis tools, and figure-generation tooling, establishing a new standard for adversarial LLM evaluation in security research.
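A set of legitimate emails matters because a false positive rate can only be measured against mail that should have been delivered. The sketch below shows the standard confusion-matrix definitions of the three reported metrics. It assumes a "bypass" means a phishing email that the filter let through; the project's exact definitions may differ, and the counts in the usage example are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterCounts:
    phishing_blocked: int  # true positives
    phishing_passed: int   # false negatives (assumed to be "bypasses")
    legit_blocked: int     # false positives
    legit_passed: int      # true negatives


def accuracy(c: FilterCounts) -> float:
    """Share of all emails the filter classified correctly."""
    total = (c.phishing_blocked + c.phishing_passed
             + c.legit_blocked + c.legit_passed)
    return (c.phishing_blocked + c.legit_passed) / total if total else 0.0


def false_positive_rate(c: FilterCounts) -> float:
    """Share of legitimate emails that were wrongly blocked."""
    legit = c.legit_blocked + c.legit_passed
    return c.legit_blocked / legit if legit else 0.0


def bypass_rate(c: FilterCounts) -> float:
    """Share of phishing emails that got past the filter."""
    phishing = c.phishing_blocked + c.phishing_passed
    return c.phishing_passed / phishing if phishing else 0.0


# Made-up counts for illustration; these are not the benchmark's results.
example = FilterCounts(phishing_blocked=90, phishing_passed=10,
                       legit_blocked=3, legit_passed=297)
print(f"accuracy={accuracy(example):.4f}")
print(f"false_positive_rate={false_positive_rate(example):.4f}")
print(f"bypass_rate={bypass_rate(example):.4f}")
```

Reporting accuracy alongside the false positive rate is useful because a filter that blocks everything would stop every phishing email while making legitimate mail unusable.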

  • As the first reproducible multi-agent adversarial benchmark spanning four major AI providers, it establishes an evaluation framework for email security in LLM systems

Editorial Opinion

Phishing Arena addresses a critical blind spot in AI safety: rigorous, reproducible benchmarking of how real-world LLMs handle adversarial social engineering. The finding that contextual plausibility trumps technical sophistication is particularly sobering for security practitioners. It suggests that LLM-powered email defenses must move beyond pattern matching toward a deeper understanding of organizational communication norms and context. By releasing the benchmark as open source with full transparency, Stocco gives the research community a valuable tool for identifying and closing these human-factor vulnerabilities before they are exploited at scale.

AI Agents | Machine Learning | Cybersecurity | Open Source
