Microsoft Copilot Researcher Introduces Multi-Model Intelligence with Critique and Council Features
Key Takeaways
- Researcher now features Critique, a multi-model deep research system that separates generation from evaluation using models from Anthropic and OpenAI, delivering superior accuracy compared to single-model approaches
- Critique achieves a +7.0 point improvement on the DRACO benchmark and outperforms Perplexity Deep Research by 13.88%, demonstrating best-in-class deep research quality
- Council allows users to compare multiple model responses side-by-side with detailed insights on agreement points and divergences across different AI models
Summary
Microsoft has announced significant enhancements to Researcher, its deep research agent within Microsoft 365 Copilot, introducing two new multi-model capabilities: Critique and Council. Critique employs a dual-model architecture that separates generation from evaluation, combining models from frontier AI labs including Anthropic and OpenAI. One model handles planning, retrieval, and initial draft creation, while a second model acts as an expert reviewer, validating and refining the draft before producing the final report. This approach has delivered substantial gains on the DRACO benchmark: a +7.0 point improvement, outperforming Perplexity Deep Research by 13.88%.
Council is a complementary feature that displays multiple model responses side-by-side within the Researcher experience, along with a cover letter that highlights areas of agreement, divergence, and unique insights from each model. The Critique system incorporates rubric-based evaluation similar to academic and professional research workflows, focusing on source reliability assessment, report completeness, and strict evidence grounding enforcement. By emphasizing evaluation as much as generation, the architecture creates a feedback loop designed to enhance factual accuracy, analytical breadth, and overall presentation quality.
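The generate-then-critique flow described above can be pictured as a simple pipeline: one model drafts, a second scores the draft against a rubric, and the loop revises until the reviewer approves. The sketch below is purely illustrative; the function names, rubric criteria, and stub models are assumptions for demonstration, not Microsoft's actual implementation or API.

```python
# Hypothetical generator/critic pipeline; all names and logic here are
# illustrative assumptions, not the real Researcher internals.

# Rubric criteria mirroring those described for Critique.
RUBRIC = [
    "source_reliability",   # are claims backed by trustworthy sources?
    "completeness",         # does the report cover the question fully?
    "evidence_grounding",   # is every claim tied to cited evidence?
]

def generator_model(question: str) -> dict:
    """Stand-in for the model that plans, retrieves, and drafts."""
    return {
        "question": question,
        "draft": f"Initial findings on: {question}",
        "citations": ["source-1"],
    }

def critic_model(report: dict) -> dict:
    """Stand-in for the second model that reviews against the rubric."""
    scores = {criterion: 1.0 if report["citations"] else 0.0
              for criterion in RUBRIC}
    return {"scores": scores,
            "approved": all(s >= 0.5 for s in scores.values())}

def deep_research(question: str, max_rounds: int = 2) -> dict:
    """Generate a draft, then loop critique/revision until approved."""
    report = generator_model(question)
    review = critic_model(report)
    for _ in range(max_rounds):
        if review["approved"]:
            break
        # A real system would feed the critique text back into generation.
        report["draft"] += " [revised after critique]"
        review = critic_model(report)
    report["review"] = review
    return report
```

The key design point the sketch captures is the separation of roles: the critic never writes content, it only scores it, so its judgments act as a gate on what the generator may emit as the final report.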
Editorial Opinion
Microsoft's move to implement multi-model intelligence in Researcher represents a thoughtful advancement in AI-assisted research, moving beyond the single-model paradigm that has dominated the space. By splitting generative and evaluative roles across separate models, Microsoft has created a system that mirrors established academic review practices while drawing on the strengths of multiple frontier models. The substantial performance gains on the DRACO benchmark suggest that this architecture addresses real quality gaps in research synthesis, though the long-term value will depend on how well users integrate these insights into their actual research workflows.



