Sarvam AI Open-Sources 30B and 105B Models, India's First Competitive Reasoning LLMs Trained Domestically
Key Takeaways
- Sarvam 30B and 105B are India's first competitive open-source LLMs, trained entirely on domestic infrastructure under the IndiaAI mission
- Both models use Mixture-of-Experts architecture with 128 experts, achieving state-of-the-art performance on Indian language benchmarks while remaining globally competitive
- The models are already in production, powering Sarvam's Samvaad conversational platform (30B) and Indus reasoning assistant (105B)
Summary
Indian AI company Sarvam AI has released Sarvam 30B and Sarvam 105B as open-source language models, a milestone as the first competitive Indian-built LLMs trained entirely on domestic infrastructure. Both models were developed from scratch using compute provided under India's IndiaAI mission. They employ a Mixture-of-Experts (MoE) architecture with 128 experts, designed to balance high reasoning capacity with efficient deployment across a range of hardware configurations.
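As a reference point, the heart of an MoE layer is a learned router that sends each token to a small subset of expert feed-forward networks, so only a fraction of the parameters are active per token. The PyTorch sketch below illustrates top-k routing; the 128-expert count matches what Sarvam reports, but the hidden sizes, top-k value, and layer design are illustrative assumptions, not the models' published configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Illustrative top-k Mixture-of-Experts layer.

    num_experts=128 matches the expert count reported for the Sarvam
    models; d_model, d_ff, and top_k are placeholder values.
    """
    def __init__(self, d_model=1024, d_ff=4096, num_experts=128, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                                # x: (tokens, d_model)
        logits = self.router(x)                          # score every expert per token
        weights, idx = logits.topk(self.top_k, dim=-1)   # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)             # renormalize their gate weights
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                    # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out
```

This routing is what lets MoE models hold large total parameter counts while keeping per-token compute modest: each token pays only for its top-k experts, not all 128.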
Sarvam 105B, the larger model, was trained on 12 trillion tokens and demonstrates globally competitive performance on reasoning, programming, and agentic tasks while achieving state-of-the-art results on Indian language benchmarks. It powers Indus, Sarvam's AI assistant for complex reasoning workflows. The smaller Sarvam 30B, trained on 16 trillion tokens, is optimized for real-time conversational applications and currently powers Samvaad, the company's conversational agent platform. Both models support long-context inputs and feature architectural optimizations: Multi-head Latent Attention (MLA) in the 105B model and Grouped Query Attention (GQA) in the 30B variant.
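For context on these attention choices, GQA cuts inference memory by letting groups of query heads share a single key/value head, shrinking the KV cache that dominates long-context serving. The sketch below shows an illustrative GQA forward pass (causal masking omitted for brevity); the head counts and dimensions are assumptions, not Sarvam's published settings, and MLA, which instead compresses keys and values into a low-rank latent, is more involved and not shown.

```python
import torch

def grouped_query_attention(q, k, v):
    """Illustrative GQA: q has more heads than k/v, and each group of
    query heads attends against one shared key/value head.

    q: (batch, n_q_heads, seq, d_head)
    k, v: (batch, n_kv_heads, seq, d_head), with n_q_heads % n_kv_heads == 0
    """
    n_q_heads, n_kv_heads = q.shape[1], k.shape[1]
    rep = n_q_heads // n_kv_heads
    k = k.repeat_interleave(rep, dim=1)   # broadcast each KV head to its query group
    v = v.repeat_interleave(rep, dim=1)
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ v

# Example (placeholder sizes): 32 query heads sharing 8 KV heads
# means a 4x smaller KV cache than standard multi-head attention.
q = torch.randn(1, 32, 128, 64)
k = torch.randn(1, 8, 128, 64)
v = torch.randn(1, 8, 128, 64)
print(grouped_query_attention(q, k, v).shape)  # torch.Size([1, 32, 128, 64])
```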
The release represents a full-stack AI development effort, with Sarvam building proprietary capabilities across data curation, tokenization, model architecture, training infrastructure, and inference systems. The company emphasized its investment in synthetic data generation pipelines and allocated substantial training resources to the 10 most-spoken Indian languages. Model weights are available through AI Kosh and Hugging Face, with API access provided through Sarvam's platform, positioning these models as sovereign alternatives to foreign-developed LLMs for Indian applications.
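Since the weights are distributed through Hugging Face, loading them should follow the standard transformers pattern. The repository ID below is a hypothetical placeholder, as the exact model card names are not specified here.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo ID -- substitute the actual model card published
# under Sarvam's Hugging Face organization.
repo = "sarvamai/sarvam-30b"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, device_map="auto", torch_dtype="auto"
)

prompt = "भारत की राजधानी क्या है?"  # Hindi: "What is the capital of India?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```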
Editorial Opinion
Sarvam's release represents a strategic milestone in AI sovereignty, demonstrating that competitive frontier models can be developed outside the US-China AI duopoly. The emphasis on Indian language performance alongside global competitiveness addresses a genuine market gap, as most leading LLMs remain English-centric. However, the true test will be whether these models can maintain competitive performance as frontier labs scale to multi-trillion-parameter systems, and whether India's domestic compute infrastructure can support that trajectory without depending on foreign hardware supply chains.


