GitHub Reverses Course, Will Train AI Models on User Data by Default Starting April 24
Key Takeaways
- GitHub will begin training AI models on user data (code snippets, inputs, outputs, and context) by default starting April 24, 2024, for the Copilot Free, Pro, and Pro+ tiers
- The policy uses an opt-out model, letting users disable data training in their privacy settings, in contrast to the stricter opt-in consent requirements typical in Europe
- Private repositories are no longer fully private when Copilot is enabled, as code snippets from them can be collected for AI training purposes
Summary
Microsoft's GitHub announced it will begin using customer interaction data to train its AI models starting April 24, 2024, marking a significant policy reversal. The change applies to Copilot Free, Pro, and Pro+ users, while Copilot Business and Enterprise customers remain exempt. The data collection includes code snippets, inputs, outputs, file names, comments, and user interactions with Copilot features from both public and private repositories.
Users can opt out in their privacy settings; the change follows an opt-out model rather than the opt-in consent typically required in Europe. GitHub's Chief Product Officer Mario Rodriguez argued the data collection will improve model accuracy and code suggestions, and GitHub maintains that similar practices are industry standard. The policy shift has nonetheless generated significant community backlash, with users questioning what "private" still means for private repositories and raising concerns about consent, even though GitHub Copilot's underlying Codex model was already trained on publicly available GitHub code.
Editorial Opinion
GitHub's reversal on AI training data represents a troubling normalization of data extraction in the AI industry. While the company frames this as an opt-out choice, the fundamental issue remains: developers who depend on Copilot must actively opt out to protect their code, rather than being asked for explicit consent. The fact that GitHub's own Copilot was built on previously scraped GitHub code illustrates how the AI industry has systematically extracted value from developers without meaningful consent, a precedent that makes this new policy feel inevitable rather than justified.