Researchers Achieve Near-Zero Hallucination in Bilingual English-Hindi LLMs Through Citation Grounding
Key Takeaways
- XKD-Dial achieves a 0.0% hallucination rate in encoder-decoder models through citation-grounded supervised fine-tuning (illustrated in the sketch after this list)
- The progressive training pipeline scales bilingual capabilities to Hindi while preventing catastrophic forgetting
- Smaller models (1B parameters) can match larger models (7B) on English tasks after fine-tuning with explicit citation grounding
- Post-hoc explainability analysis reveals how citation behavior is learned, not just whether it is learned
- Citation-grounded supervised fine-tuning matches or outperforms GRPO alignment for structured citation tasks
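To make "citation-grounded supervised fine-tuning" concrete, here is one plausible shape for a training example. The paper's actual data schema is not given in this summary, so the field names and sample text below are purely illustrative:

```python
# Hypothetical shape of a citation-grounded SFT instance: the model is
# trained to emit inline markers such as [1] that point back at the
# provided sources, making every factual claim checkable.
example = {
    "sources": [
        {"id": 1, "text": "The Taj Mahal was commissioned in 1632 by Shah Jahan."},
        {"id": 2, "text": "It stands in Agra, Uttar Pradesh, India."},
    ],
    "question": "Who commissioned the Taj Mahal, and where is it?",
    # Target answer with inline citations tying each claim to a source.
    "target": "Shah Jahan commissioned it in 1632 [1]; it stands in Agra, India [2].",
}
print(example["target"])
```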
Summary
A new research paper presents XKD-Dial, a progressive four-stage training pipeline designed to reduce hallucination in large language models while preserving explainability through citation grounding in both English and Hindi. The approach combines multilingual adaptation, supervised fine-tuning with explicit citation mechanisms, and reinforcement learning with citation-aware rewards, applied across six model architectures ranging from 250M to 7B parameters.
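The summary names the pipeline's ingredients but not its exact stage boundaries, so the following is only a minimal orchestration sketch. The stage labels, dataset identifiers, and the `train_stage` stub are assumptions, not the paper's code:

```python
# Hypothetical four-stage progressive pipeline; all names are illustrative.
from dataclasses import dataclass

@dataclass
class Stage:
    name: str       # illustrative stage label
    objective: str  # "lm" = token-level SFT loss, "rl" = reward-driven update
    data: str       # placeholder dataset identifier

PIPELINE = [
    Stage("hindi_adaptation", "lm", "hi_monolingual_corpus"),
    Stage("citation_sft_en", "lm", "en_cited_dialogues"),
    Stage("citation_sft_hi", "lm", "hi_cited_dialogues"),
    Stage("citation_rl", "rl", "citation_reward_prompts"),
]

def train_stage(model, stage):
    # Stand-in for a real trainer; a real run would update weights here.
    print(f"[{stage.name}] objective={stage.objective} data={stage.data}")
    return model

def run_pipeline(model):
    for stage in PIPELINE:
        # Each stage resumes from the previous stage's weights, which is how
        # a progressive pipeline can add Hindi capability while retaining
        # English, rather than relearning from scratch.
        model = train_stage(model, stage)
    return model

run_pipeline(model=object())
```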
The central result is a 0.0% hallucination rate for encoder-decoder models from the second training stage onward, with decision-making kept transparent through three post-hoc explainability analyses: cross-attention alignment, Integrated Gradients attribution, and occlusion-based causal grounding. The research also shows that smaller models can match larger models' performance on English tasks after supervised fine-tuning, and that the progressive pipeline improves Hindi capabilities without catastrophic forgetting.
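Of the three analyses, occlusion-based causal grounding is the easiest to illustrate: remove one source passage at a time and measure how much the model's confidence in its answer drops. A minimal sketch, assuming a generic scoring callable; `answer_logprob` is a hypothetical stand-in for a real model call:

```python
# Occlusion-based causal grounding: a large log-prob drop when a passage is
# removed indicates that the passage causally supports the model's answer.
from typing import Callable, List

def occlusion_grounding(
    passages: List[str],
    answer: str,
    answer_logprob: Callable[[List[str], str], float],
) -> List[float]:
    """Per passage, the drop in answer log-prob when that passage is removed."""
    base = answer_logprob(passages, answer)
    drops = []
    for i in range(len(passages)):
        occluded = passages[:i] + passages[i + 1:]
        drops.append(base - answer_logprob(occluded, answer))
    return drops

# Toy scorer: confidence rises with the number of passages mentioning the answer.
fake = lambda ctx, ans: -1.0 + 0.5 * sum(ans in p for p in ctx)
print(occlusion_grounding(
    ["Delhi is India's capital.", "Hindi uses the Devanagari script."],
    "Delhi", fake))  # -> [0.5, 0.0]: only the first passage grounds the answer
```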
The study evaluates performance on six automatic metrics, including hallucination rate, Citation-F1, and FactScore, and finds that well-designed supervised fine-tuning yields results comparable or superior to reinforcement-learning alignment with GRPO (Group Relative Policy Optimization) for structured citation tasks. This work addresses a significant gap in knowledge-grounded dialogue systems, which have historically focused on English and lacked explicit citation mechanisms for verifying factual claims.
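The summary does not spell out how Citation-F1 is computed; one plausible reading, sketched below, scores the set of source IDs a response cites against a gold support set and averages F1 over responses. The function names and the per-response set formulation are illustrative:

```python
# A plausible Citation-F1: set overlap between cited and gold source IDs.
from typing import List, Set

def citation_f1(predicted: Set[str], gold: Set[str]) -> float:
    if not predicted and not gold:
        return 1.0  # nothing to cite, nothing cited: perfect by convention
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(predicted), tp / len(gold)
    return 2 * precision * recall / (precision + recall)

def corpus_citation_f1(preds: List[Set[str]], golds: List[Set[str]]) -> float:
    return sum(citation_f1(p, g) for p, g in zip(preds, golds)) / len(preds)

# Example: the model cites sources 1 and 3, but gold support is 1 and 2.
print(citation_f1({"1", "3"}, {"1", "2"}))  # -> 0.5
```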
Editorial Opinion
This research represents a meaningful advance in addressing one of the most persistent problems in LLMs, hallucination, particularly in multilingual settings. By combining citation grounding with explainability mechanisms, the work moves beyond simply reducing errors to creating verifiable, transparent dialogue systems. The achievement of zero hallucination rates in specific model classes is noteworthy, though the practical applicability of these techniques to larger, more complex models remains an open question. The focus on under-resourced language pairs like English-Hindi is commendable and could serve as a template for extending these methods to other low-resource languages.