BotBeat

RESEARCH · Multiple AI Companies · 2026-03-03

New Research Exposes Privacy Gaps in Major AI Companies' Use of User Chat Data for Model Training

Key Takeaways

  • All six major U.S. AI developers studied use user chat data for model training by default, with some retaining data indefinitely
  • Companies may collect and train on sensitive personal information, including biometric and health data, as well as uploaded files
  • Four of the six companies appear to include children's chat data in training datasets, raising significant ethical and legal concerns
Source: Hacker News (https://arxiv.org/abs/2509.05382)

Summary

A comprehensive research paper published on arXiv analyzes the privacy policies of six leading U.S. AI companies, revealing significant concerns about how user chat data is collected and used for training large language models. The study, authored by Jennifer King, Kevin Klyman, and colleagues, found that all six frontier AI developers appear to use user chat data for model training by default, with some retaining this data indefinitely. The researchers employed a novel qualitative coding schema based on the California Consumer Privacy Act (CCPA) to systematically compare data practices across the companies.
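To give a sense of what a CCPA-based qualitative coding schema might look like in practice, here is a minimal illustrative sketch. The category names, code values, and example coding below are assumptions for illustration only and are not taken from the study itself:

```python
# Illustrative sketch of a qualitative coding schema for comparing
# AI developers' privacy policies. Category names loosely echo CCPA
# concepts; all names and example codings here are hypothetical.

CATEGORIES = [
    "trains_on_chats_by_default",
    "retention_period_stated",
    "collects_sensitive_data",
    "includes_childrens_data",
]

VALID_CODES = {"yes", "no", "unclear"}

def code_policy(answers: dict) -> dict:
    """Code one company's policy: each category gets yes/no/unclear.

    Categories not answered by the coder default to "unclear",
    mirroring how opaque policies resist definitive coding.
    """
    coded = {}
    for category in CATEGORIES:
        value = answers.get(category, "unclear")
        if value not in VALID_CODES:
            raise ValueError(f"invalid code {value!r} for {category}")
        coded[category] = value
    return coded

# Hypothetical coding for one unnamed company:
example = code_policy({
    "trains_on_chats_by_default": "yes",
    "retention_period_stated": "no",
})
```

A schema like this lets researchers tabulate answers company by company and count, for instance, how many policies leave a category "unclear" — which is one way a transparency gap becomes quantifiable.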

The findings raise particular alarm about the collection of sensitive personal information disclosed in chats, including biometric and health data, as well as files uploaded by users. Four of the six companies examined appear to include children's chat data in their training datasets, alongside customer data from other products. The researchers note that privacy policies often lack essential details about these practices, creating a significant transparency gap between what users understand about their data and how it's actually being used.

The paper addresses critical implications including the lack of meaningful user consent for chat data usage in model training, data security risks from indefinite retention periods, and ethical concerns around training on children's data. The authors conclude with recommendations for both policymakers and developers to address these privacy challenges. This research comes at a crucial time as hundreds of millions of people worldwide now regularly interact with LLM-powered chatbots, often sharing personal and sensitive information without full awareness of how it may be repurposed.

  • Privacy policies consistently lack essential transparency about data collection and usage practices
  • The research provides specific recommendations for policymakers and developers to address LLM privacy challenges

Editorial Opinion

This research delivers a sobering reality check for the AI industry's approach to user privacy. While companies race to improve their models with ever-larger datasets, the default practice of training on user conversations—including sensitive personal information and children's data—without explicit, informed consent represents a fundamental misalignment between business incentives and user expectations. The lack of transparency documented in this study suggests that current self-regulation is insufficient, and may strengthen arguments for comprehensive AI-specific privacy legislation that goes beyond existing frameworks like CCPA.

Large Language Models (LLMs) · Science & Research · Regulation & Policy · Ethics & Bias · Privacy & Data

