xAI Brings Grok Text-to-Speech and Speech-to-Text to Puter.js with Free Developer Access
Key Takeaways
- xAI's TTS and STT APIs are now integrated into Puter.js with zero cost and no registration barriers
- Text-to-Speech features five distinct voices with expressive inline speech tags for nuanced audio delivery
- Speech-to-Text includes advanced features like speaker diarization, word-level timestamps, and multichannel support
Summary
xAI has made its Text-to-Speech (TTS) and Speech-to-Text (STT) APIs available through Puter.js, giving developers free access to its voice capabilities without API keys or registration. The TTS offering has five distinct voices, including Eve, Rex, Sal, and Leo, and supports inline speech tags such as [pause], [laugh], and <whisper> for expressive delivery. Each request accepts a maximum of 15,000 characters.
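Because each TTS request is capped at 15,000 characters, longer scripts have to be split before they are sent. The following is a minimal sketch of one way to do that, not an official Puter.js or xAI utility. The helper names `splitForTts` and `speakAll` are hypothetical, and the commented-out call at the end assumes that `puter.ai.txt2speech` accepts `provider` and `voice` options for xAI, which the article does not confirm.

```typescript
// Limit stated in the article: at most 15,000 characters per TTS request.
const MAX_TTS_CHARS = 15_000;

// Hypothetical helper: split long text into chunks that each fit the limit,
// preferring to break after a sentence, then at a space, then with a hard cut.
function splitForTts(text: string, limit: number = MAX_TTS_CHARS): string[] {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > limit) {
    const head = rest.slice(0, limit);
    // Position of the last sentence-ending punctuation followed by a space.
    let cut = Math.max(
      head.lastIndexOf(". "),
      head.lastIndexOf("! "),
      head.lastIndexOf("? "),
    );
    if (cut > 0) {
      cut += 1; // keep the punctuation mark in the current chunk
    } else {
      cut = head.lastIndexOf(" ");
      if (cut <= 0) cut = limit; // no space found: cut at the limit
    }
    chunks.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

// The synthesizer is injected so the splitting logic stays independent of any SDK.
type Synthesize = (text: string) => Promise<unknown>;

// Hypothetical helper: synthesize each chunk in order, return the request count.
async function speakAll(text: string, synthesize: Synthesize): Promise<number> {
  const chunks = splitForTts(text);
  for (const chunk of chunks) {
    await synthesize(chunk);
  }
  return chunks.length;
}

// In a page that has loaded Puter.js, usage might look like this.
// ASSUMPTION: the option names and values below are illustrative only.
// speakAll(script, (t) => puter.ai.txt2speech(t, { provider: "xai", voice: "Eve" }));
```

This splitter does not understand speech tags, so a paired tag that happens to straddle a chunk boundary would be broken. Text that relies heavily on tags is better split by hand at natural pauses.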
The Speech-to-Text service provides transcription with word-level timestamps, speaker diarization to attribute speech to individual speakers, and multichannel audio support. Both APIs ship as part of Puter.js and can be added to a project with a single npm install or a script tag, with no infrastructure to set up or configure. The integration makes xAI's voice technology available to any developer building web applications.
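Word-level timestamps and diarization are most useful once they are combined into speaker turns. The sketch below shows that post-processing step under stated assumptions: the article does not document the transcription response format, so the `TimedWord` shape (with `text`, `start`, `end`, and `speaker` fields) and the `groupIntoTurns` helper are hypothetical and would need to be mapped onto whatever the API actually returns.

```typescript
// HYPOTHETICAL shape of one transcribed word. The real response format
// is not described in the article and may differ.
interface TimedWord {
  text: string;
  start: number; // seconds from the beginning of the audio
  end: number; // seconds from the beginning of the audio
  speaker: string; // diarization label, e.g. "speaker_0"
}

interface Turn {
  speaker: string;
  start: number;
  end: number;
  text: string;
}

// Merge consecutive words from the same speaker into a single turn.
function groupIntoTurns(words: TimedWord[]): Turn[] {
  const turns: Turn[] = [];
  for (const w of words) {
    const last = turns[turns.length - 1];
    if (last !== undefined && last.speaker === w.speaker) {
      // Same speaker is still talking: extend the current turn.
      last.text += " " + w.text;
      last.end = w.end;
    } else {
      // Speaker changed (or this is the first word): start a new turn.
      turns.push({ speaker: w.speaker, start: w.start, end: w.end, text: w.text });
    }
  }
  return turns;
}

// Example with made-up data.
const sampleWords: TimedWord[] = [
  { text: "Hello", start: 0.0, end: 0.4, speaker: "speaker_0" },
  { text: "there", start: 0.4, end: 0.8, speaker: "speaker_0" },
  { text: "Hi", start: 1.0, end: 1.2, speaker: "speaker_1" },
  { text: "Welcome", start: 1.5, end: 2.0, speaker: "speaker_0" },
];

for (const turn of groupIntoTurns(sampleWords)) {
  console.log(`[${turn.start.toFixed(1)}s to ${turn.end.toFixed(1)}s] ${turn.speaker}: ${turn.text}`);
}
```

Grouping on the client keeps the raw word timings available for features such as caption highlighting, while the turns give a readable transcript.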


