BotBeat

POLICY & REGULATION | GitHub | 2026-03-26

GitHub Reverses Course, Will Train AI Models on User Data by Default Starting April 24

Key Takeaways

  • GitHub will begin training AI models on user data (code snippets, inputs, outputs, and context) by default starting April 24, 2026, for Copilot Free, Pro, and Pro+ users
  • The policy uses an opt-out model: users can disable data training in their privacy settings, in contrast to the stricter opt-in requirements typical in Europe
  • Private repositories are no longer fully private when users have Copilot enabled, as code snippets can be collected for AI training purposes
Source: Hacker News, https://www.theregister.com/2026/03/26/github_ai_training_policy_changes/

Summary

Microsoft's GitHub announced it will begin using customer interaction data to train its AI models starting April 24, 2026, marking a significant policy reversal. The change applies to Copilot Free, Pro, and Pro+ users, while Copilot Business and Enterprise customers remain exempt. The data collected includes code snippets, inputs, outputs, file names, comments, and user interactions with Copilot features, from both public and private repositories.

Users can opt out in their privacy settings; the policy follows an opt-out model rather than the opt-in requirements typical in Europe. GitHub's Chief Product Officer Mario Rodriguez argued the data collection will improve model accuracy and code suggestions. However, the policy shift has generated significant community backlash: users are skeptical of the "private" label on repositories and concerned about consent. The backlash comes even though GitHub Copilot's underlying Codex model was already trained on publicly available GitHub code.

  • The move has generated substantial community pushback, with GitHub users expressing concerns about consent and data privacy despite GitHub's claim that similar practices are industry standard

Editorial Opinion

GitHub's reversal on AI training data represents a troubling normalization of data extraction in the AI industry. While the company frames this as an opt-out choice, the fundamental issue remains: developers who depend on Copilot must actively opt out to protect their code, rather than being asked for explicit consent. The fact that GitHub's own Copilot was built on previously scraped GitHub code illustrates how the AI industry has systematically extracted value from developers without meaningful consent, a precedent that makes this new policy feel inevitable rather than justified.

Generative AI | Ethics & Bias | Privacy & Data
