Seer: Open-Source Local AI Brings Accessible Image Descriptions to Web Users
Key Takeaways
- ▸Seer is an open-source browser extension that uses AI to automatically generate descriptions for images lacking alt text, improving web accessibility for screen reader users
- ▸All image processing happens locally on the user's device using Google's PaliGemma2 model—no APIs, no data transmission, and no internet required after setup
- ▸The tool addresses critical limitations of cloud-based alternatives by offering a free, private, offline solution that works for everyone, including users in low-resource environments
Summary
Seer is a new open-source browser extension that automatically generates descriptions for images lacking alt text, making the web more accessible for users of screen readers such as NVDA, JAWS, and VoiceOver. It targets a persistent accessibility gap: most websites fail to provide proper alt text for images, leaving visually impaired users unable to access important visual content. Seer fills this gap by analyzing images and generating descriptions in real time, in the background and without user intervention.
The tool uses Google's PaliGemma2, a 3-billion-parameter vision-language model, to process images entirely on the user's computer without sending data to remote servers. Users install the Seer daemon locally and load the browser extension. From then on, the system detects images without alt text, generates descriptions with the local model, and passes the output to the screen reader. The setup requires approximately 3GB of RAM and no GPU, and it works completely offline after initial installation.
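The detection step in that pipeline can be illustrated with a minimal sketch. This is not Seer's code: the actual extension runs in the browser, while the example below uses Python's standard-library HTML parser purely to show the idea. The names `MissingAltFinder`, `images_needing_description`, and `include_blank` are hypothetical. By default the sketch skips images with an empty `alt=""`, because that attribute conventionally marks an image as decorative; whether Seer makes the same choice is not stated in the source.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> that has no usable alt text."""

    def __init__(self, include_blank=False):
        super().__init__()
        # include_blank is a hypothetical switch: when True, alt="" is
        # also treated as missing instead of as a decorative-image marker.
        self.include_blank = include_blank
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        absent = "alt" not in attr_map
        # A valueless or whitespace-only alt counts as blank.
        blank = not absent and not (attr_map["alt"] or "").strip()
        if absent or (blank and self.include_blank):
            self.missing.append(attr_map.get("src") or "")


def images_needing_description(html, include_blank=False):
    """Return the src values of images that would need a generated description."""
    finder = MissingAltFinder(include_blank=include_blank)
    finder.feed(html)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    page = '<p><img src="a.png"><img src="b.png" alt="A cat"><img src="c.png" alt=""></p>'
    print(images_needing_description(page))
```

In a real pipeline, each returned image would then be handed to the local model, and the generated text would be attached to the image so the screen reader can announce it. Those steps are omitted here because the source does not describe their interfaces.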
Seer's open-source approach (Apache 2.0 license) addresses critical limitations of existing cloud-based image description tools, which typically charge per image, transmit photos to third-party servers, and require internet connectivity. Built by Recursia Lab, Seer is free, private, and accessible to everyone—including people in low-resource environments without reliable cloud access. The project democratizes AI-powered accessibility by removing the cost and privacy barriers that have traditionally limited such tools.
- ▸Released under the Apache 2.0 license, Seer requires only ~3GB of RAM and no GPU, lowering the hardware barrier to AI-powered accessibility technology



