AI Systems Compete to Optimize Code: Grok and Claude Achieve 8x Performance Improvements Through Assembly Rewriting
Key Takeaways
- ▸Both Claude and Grok successfully optimized code through iterative prompting, with neither AI dominating across all rounds
- ▸SIMD vectorization using NEON instructions proved critical to achieving 8x performance improvements over baseline C++ implementations
- ▸The optimizations translated effectively to C with SIMD intrinsics, indicating the AIs discovered genuine algorithmic improvements, not assembly-specific tricks
- ▸AI-assisted code optimization may offer a path to surpass traditional compiler capabilities, at least for specific benchmarks
Summary
In a performance optimization challenge, researcher Daniel Lemire tasked two AI systems, Anthropic's Claude and xAI's Grok, with rewriting a simple C++ function that counts occurrences of a character in a string as optimized ARM64 assembly. Through iterative prompting and refinement, both models progressively improved their solutions, moving from byte-by-byte comparisons to SIMD implementations built on NEON instructions, and ultimately achieved an 8-fold reduction in instruction count and execution time.
Claude and Grok alternated in producing increasingly efficient versions, with the most advanced iterations processing 64-byte chunks and using multiple accumulators so that independent operations can proceed in parallel. When the fastest assembly implementations were translated back into C using SIMD intrinsics, the performance gains persisted, which indicates that the speedup came from the vectorized algorithm itself rather than from assembly-specific tricks. Lemire's findings suggest that modern AI can identify low-level optimizations that match or exceed what compilers generate for this kind of code.
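To make the technique concrete, here is a minimal sketch of a character counter that processes 64-byte chunks with four independent NEON accumulators, alongside a byte-by-byte reference. It is not the code produced by Claude, Grok, or Lemire: the function names `count_scalar` and `count_simd` are hypothetical, and the choice of 8-bit accumulators flushed every 255 iterations is an assumption made for this illustration. On targets without AArch64 NEON the SIMD function simply falls back to the scalar loop.

```cpp
#include <cstddef>
#include <cstdint>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#define SKETCH_HAVE_NEON 1
#endif

// Byte-by-byte reference: one comparison per input byte.
std::size_t count_scalar(const char *data, std::size_t len, char target) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++i) {
        count += (data[i] == target);
    }
    return count;
}

// Hypothetical SIMD variant: 64 bytes per iteration, four accumulators.
std::size_t count_simd(const char *data, std::size_t len, char target) {
#ifdef SKETCH_HAVE_NEON
    const uint8_t *p = reinterpret_cast<const uint8_t *>(data);
    const uint8x16_t needle = vdupq_n_u8(static_cast<uint8_t>(target));
    std::size_t total = 0;
    std::size_t i = 0;
    while (len - i >= 64) {
        uint8x16_t acc0 = vdupq_n_u8(0);
        uint8x16_t acc1 = vdupq_n_u8(0);
        uint8x16_t acc2 = vdupq_n_u8(0);
        uint8x16_t acc3 = vdupq_n_u8(0);
        // Each 8-bit lane grows by at most 1 per iteration, so at most
        // 255 iterations may run before the lanes are summed and reset.
        std::size_t blocks = (len - i) / 64;
        if (blocks > 255) blocks = 255;
        for (std::size_t b = 0; b < blocks; ++b, i += 64) {
            // vceqq_u8 yields 0xFF in matching lanes; subtracting 0xFF
            // is the same as adding 1 modulo 256.
            acc0 = vsubq_u8(acc0, vceqq_u8(vld1q_u8(p + i), needle));
            acc1 = vsubq_u8(acc1, vceqq_u8(vld1q_u8(p + i + 16), needle));
            acc2 = vsubq_u8(acc2, vceqq_u8(vld1q_u8(p + i + 32), needle));
            acc3 = vsubq_u8(acc3, vceqq_u8(vld1q_u8(p + i + 48), needle));
        }
        // Widening horizontal sums: 16 lanes of at most 255 fit in 16 bits.
        total += vaddlvq_u8(acc0);
        total += vaddlvq_u8(acc1);
        total += vaddlvq_u8(acc2);
        total += vaddlvq_u8(acc3);
    }
    // Fewer than 64 bytes remain: finish with the scalar loop.
    return total + count_scalar(data + i, len - i, target);
#else
    // No AArch64 NEON available: fall back to the reference loop.
    return count_scalar(data, len, target);
#endif
}
```

Keeping four separate accumulators means the four compare-and-subtract chains do not depend on one another, which gives the processor room to overlap them. Both functions should return the same count for any input, so the scalar version doubles as a correctness check for the SIMD one.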
Editorial Opinion
This exercise reveals both the promise and limitations of current AI systems in systems programming. While Claude and Grok impressively identified SIMD optimization opportunities that outperformed the code generated by standard C++ compilers, the need for iterative, human-guided prompting and the reliance on CPU-specific knowledge suggest that AI is still a tool that augments, rather than replaces, expert performance optimization. The open question of whether AIs can discover optimizations that are impossible in higher-level languages remains tantalizing but unanswered.


