BotBeat

RESEARCH · Anthropic · 2026-04-15

Anthropic's Claude Code vs. OpenAI's Codex: Security Defaults Reveal Different AI Coding Philosophies

Key Takeaways

  • Claude Code favors explicit, well-known security libraries (bcrypt), while Codex relies more on standard library implementations and custom cryptographic code, reflecting different trust models in AI-assisted development
  • Neither model volunteered rate limiting or brute-force protection by default, despite these being common security requirements, indicating a gap in the security defaults these tools apply to authentication systems
  • Framework choice had a larger impact on security compliance than model choice: FastAPI sessions reached 96% compliance versus 73% for Next.js, with FastAPI providing better middleware-based protections, suggesting developers should pair AI coding tools with frameworks that enforce secure defaults
Source: Hacker News, https://amplifying.ai/research/ai-security-decisions

Summary

A new security benchmark comparing Anthropic's Claude Code and OpenAI's Codex reveals fundamental differences in how these AI code generation tools approach security by default. The study, published at amplifying.ai, tested both tools on six common development tasks (authentication, file uploads, search, admin controls, webhooks, and production configuration) using deliberately security-silent prompts to measure which security decisions each tool would make unprompted. Claude Code consistently chose an established security library, bcrypt, for password hashing across all six sessions, while Codex opted for standard library implementations (PBKDF2, scrypt) and, in two sessions, built JWT encoding from scratch using raw HMAC.
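To make the standard-library route described above concrete, here is a minimal Python sketch of the two choices attributed to Codex: password hashing with scrypt and a hand-rolled HS256 JWT built on raw HMAC. It is not code from the benchmark. The function names, the scrypt cost parameters, and the storage format are all assumptions chosen for illustration, and a maintained library is usually the safer choice for either job.

```python
import base64
import hashlib
import hmac
import json
import os
import time
from typing import Optional

# Illustrative scrypt cost parameters (an assumption, not taken from the study).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> str:
    """Return 'salt_hex$digest_hex' using a fresh random salt per password."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.scrypt(
        password.encode("utf-8"), salt=bytes.fromhex(salt_hex), **SCRYPT_PARAMS
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))


def _b64url(data: bytes) -> str:
    # JWTs use unpadded URL-safe base64.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that _b64url stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_jwt(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature with a fixed HS256 header."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    signing_input = ".".join(parts)
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_jwt(token: str, secret: bytes) -> Optional[dict]:
    """Return the claims if the token is valid and unexpired, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        # Pin the algorithm; never trust the token to choose it (e.g. "none").
        if header.get("alg") != "HS256":
            return None
        expected = _sign(f"{header_b64}.{payload_b64}", secret)
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
        # Tokens without an "exp" claim are treated as expired.
        if claims.get("exp", 0) <= time.time():
            return None
        return claims
    except (ValueError, TypeError, AttributeError):
        return None
```

The parts that are easy to miss when writing this by hand are the per-password salt, the constant-time comparisons, the pinned algorithm, and the expiry check. That is the practical argument for reaching for an established library instead.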

Both models exhibited the same critical omissions: neither volunteered rate limiting on login endpoints or security headers, leaving brute-force attacks and reconnaissance vectors unaddressed. The benchmark ran 33 exploit tests across 12 total sessions spanning the FastAPI and Next.js frameworks, and the framework, not the model, accounted for the largest gap in results (FastAPI at 96% compliance vs. Next.js at 73%). Beyond the scorecards, the research highlights that many application security vulnerabilities stem not from exotic attacks but from mundane, unrequested decisions: which hash function to use, whether production still serves API documentation, and whether login endpoints ever throttle. Codex shipped Swagger UI in production and exposed /openapi.json in all sessions, creating reconnaissance opportunities that static analysis missed but dynamic testing caught.
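For readers wondering what the two omitted defaults might look like, the following is a rough, framework-agnostic Python sketch: an in-memory sliding-window limiter for login attempts and a baseline set of security headers. The class name, helper name, thresholds, and header values are illustrative assumptions, not recommendations from the study. A real deployment would likely keep limiter state in a shared store rather than process memory.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict

# A common baseline; exact values are assumptions to adapt per application.
SECURITY_HEADERS: Dict[str, str] = {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "no-referrer",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'",
}


class LoginRateLimiter:
    """Allow at most max_attempts per key within a sliding time window."""

    def __init__(
        self,
        max_attempts: int = 5,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._attempts: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record an attempt for key; return False once the limit is hit."""
        now = self._clock()
        attempts = self._attempts[key]
        # Drop timestamps that have aged out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False
        attempts.append(now)
        return True


def with_security_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Merge baseline headers under response-specific ones (which win)."""
    merged = dict(SECURITY_HEADERS)
    merged.update(headers)
    return merged
```

A login handler would typically call `allow()` with a key such as the client IP plus username before checking the password, and return HTTP 429 when it yields `False`.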

  • API documentation exposure (/openapi.json, Swagger UI in production) was consistently overlooked by both models, demonstrating that reconnaissance vectors matter as much as injection vulnerabilities in real-world security posture
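As one possible mitigation for the documentation exposure noted above, FastAPI lets an application turn off its interactive docs and schema endpoint through the `docs_url`, `redoc_url`, and `openapi_url` constructor arguments. The sketch below only builds those arguments. The helper name and the `APP_ENV` variable are hypothetical, and the study does not say this is how the tested sessions should have been configured.

```python
import os
from typing import Dict, Optional


def fastapi_docs_kwargs(environment: str) -> Dict[str, Optional[str]]:
    """Return FastAPI constructor kwargs that hide docs in production.

    Hypothetical helper: passing None for these arguments disables
    Swagger UI, ReDoc, and the /openapi.json schema route respectively.
    """
    if environment.lower() == "production":
        return {"docs_url": None, "redoc_url": None, "openapi_url": None}
    # FastAPI's default paths, kept for local development.
    return {
        "docs_url": "/docs",
        "redoc_url": "/redoc",
        "openapi_url": "/openapi.json",
    }


# Usage (requires FastAPI; APP_ENV is an assumed variable name):
#   from fastapi import FastAPI
#   app = FastAPI(**fastapi_docs_kwargs(os.environ.get("APP_ENV", "production")))
```

Defaulting to `"production"` when the variable is unset means a misconfigured deployment fails closed, with docs hidden rather than exposed.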

Editorial Opinion

This benchmark is a crucial reality check for AI-assisted development in security-critical contexts. While both Claude and Codex can identify zero-day vulnerabilities in existing code, their approach to writing new code reveals that AI models still lack consistent security intuition for mundane but essential hardening decisions. The research sensibly frames this not as a scorecard but as a lens on the quiet decisions that accumulate into risk—a framing the industry should adopt more broadly. Organizations deploying these tools need to treat AI-generated code as a starting point requiring human security review, not a finished product, and should pair them with frameworks that provide security by default.

Large Language Models (LLMs) · Cybersecurity · AI Safety & Alignment
