Microsoft Copilot Cowork Vulnerable to File Exfiltration via Indirect Prompt Injection
Key Takeaways
- Copilot Cowork automatically approves sending Teams and email messages to the active user without human confirmation, bypassing intended security controls
- Attackers can inject malicious prompts into uploaded skill files to manipulate the agent into exfiltrating pre-authenticated download links for sensitive corporate files
- The vulnerability represents a systemic risk of agentic systems: delegating authority across multiple interconnected enterprise systems multiplies the prompt-injection attack surface
Summary
Security researchers have disclosed a critical vulnerability in Microsoft Copilot Cowork, a new agentic AI feature in Microsoft 365, that allows attackers to exfiltrate sensitive files through indirect prompt injection. The vulnerability exploits the fact that Copilot Cowork automatically approves certain actions, specifically sending emails and Teams messages to the active user, without requiring human confirmation. An attacker can plant a malicious prompt in a skill file uploaded to the service, causing the agent to send the user a Teams or Outlook message that contains pre-authenticated download links to sensitive files. The links are exfiltrated when the user opens the message.
The attack works by embedding prompt injection payloads in skill files, which users commonly obtain from external sources and upload. When triggered, the agent retrieves pre-authenticated download links for files containing PII and financial data stored in SharePoint or OneDrive, then embeds those links in hidden image URLs that point to attacker-controlled sites and places them in a Teams message. When the message is opened and the images are requested, the links travel to the attacker's server as part of the URL. Because sending messages to the active user bypasses approval workflows, the malicious action executes without the user's awareness. The researchers report high success rates against multiple state-of-the-art models, including Claude Opus 4.7.
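The image-URL channel described above can be illustrated with a small check on an outbound message body. This is a minimal sketch, not the researchers' tooling or any Microsoft API: it assumes the message body is HTML, and the function name `untrusted_image_urls`, the allowlist `TRUSTED_IMAGE_HOSTS`, and the host `contoso.sharepoint.com` are all hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would define its own trusted hosts.
TRUSTED_IMAGE_HOSTS = {"contoso.sharepoint.com"}


class _ImageSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def untrusted_image_urls(message_html, trusted_hosts=TRUSTED_IMAGE_HOSTS):
    """Return http(s) image URLs whose host is not on the allowlist.

    Any such URL is a potential exfiltration channel, because data placed
    in its path or query string reaches that host when the image loads.
    """
    collector = _ImageSrcCollector()
    collector.feed(message_html)
    flagged = []
    for src in collector.sources:
        parsed = urlparse(src)
        host = (parsed.hostname or "").lower()
        if parsed.scheme in ("http", "https") and host not in trusted_hosts:
            flagged.append(src)
    return flagged
```

Called on a message that contains one image hosted on an unknown domain and one on the allowlisted host, the function returns only the first URL. An allowlist is used here rather than a blocklist because an attacker can register new domains at will.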
Microsoft's documentation states that Copilot Cowork requires user permission before sensitive actions, but the implementation does not hold to this for messages sent to the active user. The researchers note that this vulnerability reflects a broader architectural risk: giving agents cross-system access to enterprise ecosystems dramatically expands the prompt-injection attack surface. While each integrated capability may be benign in isolation, their combination under delegated agent authority creates significant security risks, a pattern reminiscent of how URL previews in communication apps have become data exfiltration vectors.
- Admins have limited oversight of skills, which are automatically loaded from user OneDrive paths, making it difficult to prevent malicious skill uploads
- The attack succeeded against advanced models including Claude Opus 4.7, which suggests it does not depend on the weaknesses of any one model
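The auto-approval behavior at the center of the disclosure can be modeled in a few lines. The sketch below is an assumption-laden illustration, not Copilot Cowork's actual logic: `SendMessageAction`, `permissive_gate`, and `stricter_gate` are hypothetical names, and the stricter rule is one possible tightening for comparison, not a fix proposed by the researchers or Microsoft.

```python
import re
from dataclasses import dataclass

# Rough URL matcher for illustration; not a complete URL grammar.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


@dataclass
class SendMessageAction:
    """Hypothetical representation of an agent's request to send a message."""
    recipient: str
    body: str


def permissive_gate(action, active_user):
    """Model of the reported behavior: auto-approve any self-addressed message.

    Returns True when the action would run without human confirmation.
    """
    return action.recipient.lower() == active_user.lower()


def stricter_gate(action, active_user):
    """Illustrative alternative: auto-approve only self-addressed messages
    that contain no URLs, so link-bearing messages need confirmation."""
    if action.recipient.lower() != active_user.lower():
        return False
    return URL_PATTERN.search(action.body) is None
```

A self-addressed message whose body carries an image URL on an attacker's domain passes `permissive_gate` but not `stricter_gate`. The comparison shows why checking only the recipient is insufficient: the recipient is trusted, but the content of the message is under the influence of whoever wrote the injected prompt.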
Editorial Opinion
This disclosure highlights a critical tension in the design of enterprise agentic systems: the convenience of automatic action approval conflicts with fundamental security principles. Individual capabilities such as sending a message or accessing a file seem reasonable when granted to a user, but delegating them to an AI agent that can be manipulated via prompt injection creates a risk the user no longer controls. Microsoft's architectural choice to give agents broad cross-system access without robust approval workflows prioritizes user convenience over security, leaving enterprises vulnerable to sophisticated indirect attacks that bypass traditional threat models.


