BotBeat
...
← Back

> ▌

AnthropicAnthropic
OPEN SOURCEAnthropic2026-03-23

VisionClaude: Open-Source AI Vision Platform Brings Claude to iPhone and Meta Ray-Ban Glasses

Key Takeaways

  • ▸Open-source vision AI platform enables real-time multimodal interaction with Claude through iPhone and Meta Ray-Ban glasses
  • ▸Supports continuous hands-free conversation with on-device speech recognition and streaming TTS output
  • ▸Integrates with Claude's MCP ecosystem and custom skills for autonomous tool execution based on visual and voice input
Source:
Hacker Newshttps://github.com/mrdulasolutions/visionclaude↗

Summary

VisionClaude, an open-source project built by mrdulasolutions, transforms iPhones and Meta Ray-Ban Smart Glasses into multimodal interfaces for Claude by enabling real-time visual input and voice interaction. Users can point their device's camera at objects or scenes, speak naturally, and Claude analyzes what it sees while executing connected tools and skills through a gateway server. The system captures video at 1080p from iPhones or 720p from Ray-Ban glasses, processes images through Claude's API, and responds via ElevenLabs text-to-speech with 10 selectable voices.

The platform leverages Apple's on-device speech recognition for privacy-first audio input and supports both Claude's Model Context Protocol (MCP) servers and custom skills that are automatically discovered and injected into Claude's system prompt. Developers can add local or remote MCP servers for integrations with email, calendar, Slack, Google Calendar, and other services, enabling Claude to take actions based on visual analysis and natural language commands.

  • High-performance video processing at 1080p/30fps for iPhone and 720p/30fps for Ray-Ban with 85% JPEG quality for accurate text and object recognition

Editorial Opinion

VisionClaude represents a compelling open-source approach to bringing vision capabilities to Claude on consumer hardware, democratizing multimodal AI beyond proprietary platforms. By leveraging existing iPhone and Ray-Ban hardware alongside Claude's tool-use capabilities, the project creates practical hands-free workflows for real-world applications. However, the reliance on a local gateway server and multiple API keys (Anthropic, ElevenLabs) may present friction for mainstream adoption, and privacy considerations around continuous camera access warrant careful user attention.

Computer VisionNatural Language Processing (NLP)Multimodal AIAI Agents

More from Anthropic

AnthropicAnthropic
RESEARCH

Inside Claude Code's Dynamic System Prompt Architecture: Anthropic's Complex Context Engineering Revealed

2026-04-05
AnthropicAnthropic
POLICY & REGULATION

Anthropic Explores AI's Role in Autonomous Weapons Policy with Pentagon Discussion

2026-04-05
AnthropicAnthropic
POLICY & REGULATION

Security Researcher Exposes Critical Infrastructure After Following Claude's Configuration Advice Without Authentication

2026-04-05

Comments

Suggested

AnthropicAnthropic
RESEARCH

Inside Claude Code's Dynamic System Prompt Architecture: Anthropic's Complex Context Engineering Revealed

2026-04-05
OracleOracle
POLICY & REGULATION

AI Agents Promise to 'Run the Business'—But Who's Liable When Things Go Wrong?

2026-04-05
AnthropicAnthropic
POLICY & REGULATION

Anthropic Explores AI's Role in Autonomous Weapons Policy with Pentagon Discussion

2026-04-05
← Back to news
© 2026 BotBeat
AboutPrivacy PolicyTerms of ServiceContact Us