BotBeat

Anthropic · RESEARCH · 2026-06-15

The Ghost Couple: How AI Models Develop Correlated Naming Biases

Key Takeaways

  • LLMs exhibit reproducible, model-family-specific naming biases when generating fictional characters: Claude, Gemini, and GPT each have a distinct preferred set of names
  • These biases are suppressed at model updates, which suggests vendors are aware of them, yet they persist in deployed models
  • Ghost-authored academic papers with real DOIs are now systematically contaminating scholarly repositories at scale (1,655+ records on Zenodo alone)
Source: Hacker News, https://arxiv.org/abs/2606.02184

Summary

A new research paper reports that large language models don't generate names randomly; they exhibit model-family-specific preferences for certain fictional identities. Claude consistently generates Elena Vasquez, Marcus Chen, and Amara Okafor (the "Ghost Couple" and their partner) as academic collaborators across independent documents; Gemini favors Aris Thorne and Lena Petrova; GPT defaults to Elara Voss. These naming biases are version-specific and leave dateable behavioral fingerprints that can identify which model generated a piece of content.
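To make the fingerprint idea concrete, here is a minimal sketch of matching author lists against the name ensembles reported above. The names come from the summary; the `NAME_ENSEMBLES` mapping, the `count_family_hits` helper, and the sample records are hypothetical illustrations, not the paper's actual method, and a name match alone would not prove which model wrote a document.

```python
from collections import Counter

# Name ensembles as reported in the summary; the data structure is an assumption.
NAME_ENSEMBLES = {
    "Claude": {"Elena Vasquez", "Marcus Chen", "Amara Okafor"},
    "Gemini": {"Aris Thorne", "Lena Petrova"},
    "GPT": {"Elara Voss"},
}


def count_family_hits(author_lists):
    """Count, per model family, the records listing at least one ensemble name."""
    hits = Counter()
    for authors in author_lists:
        # Normalize whitespace and case so trivial variations still match.
        normalized = {a.strip().lower() for a in authors}
        for family, names in NAME_ENSEMBLES.items():
            if any(name.lower() in normalized for name in names):
                hits[family] += 1
    return hits


# Toy author lists, invented purely for illustration (not real records).
records = [
    ["Elena Vasquez", "Marcus Chen"],
    ["Aris Thorne"],
    ["Jane Doe"],
    ["Amara Okafor", "Elara Voss"],
]
hits = count_family_hits(records)
print(dict(hits))
```

A record can count toward more than one family, as the last toy record does, so the counts indicate candidate matches rather than exclusive attributions.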

The research also documents a serious downstream consequence: 1,655 ghost-authored papers now exist on Zenodo (the CERN-operated repository) with fabricated journal names and backdated publication dates. Critically, these records carry real DOIs registered in DataCite, which makes them harvestable by scholarly aggregators and lets them contaminate the academic literature. The researchers traced 991 records registered in a single month and used publication dates as temporal proxies for model deployment windows. The work exposes a concerning gap between vendors' apparent awareness of these biases (which are suppressed at release boundaries) and the models' continued generation of convincing false identities at scale.

  • Publication dates on ghost-authored papers can serve as reliable temporal proxies for model deployment windows, creating dateable fingerprints
  • Real DOIs in DataCite make these ghost records harvestable by any scholarly aggregator, embedding misinformation directly into academic infrastructure
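The temporal-proxy idea can be sketched as a simple monthly histogram over record dates. This is only an illustration under stated assumptions: the function names and sample dates are hypothetical, dates are assumed to be ISO `YYYY-MM-DD` strings, and the paper's actual analysis may differ.

```python
from collections import Counter
from datetime import date


def records_per_month(publication_dates):
    """Bucket ISO-formatted dates (YYYY-MM-DD) by (year, month)."""
    counts = Counter()
    for raw in publication_dates:
        parsed = date.fromisoformat(raw)
        counts[(parsed.year, parsed.month)] += 1
    return counts


def busiest_month(publication_dates):
    """Return ((year, month), count) for the month with the most records."""
    counts = records_per_month(publication_dates)
    return max(counts.items(), key=lambda item: item[1])


# Hypothetical dates for illustration only; not taken from the Zenodo records.
sample_dates = ["2025-03-02", "2025-03-17", "2025-04-01", "2025-03-29"]
peak = busiest_month(sample_dates)
print(peak)
```

A sharp spike in one bucket is the kind of signal that could be compared against known model release dates, though backdated records would need separate handling.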

Editorial Opinion

This research exposes a troubling tension in AI deployment: companies appear to be aware of these naming biases internally (as their suppression at model releases suggests), yet they continue to ship models that generate convincing false identities at scale. The real damage isn't the quirk itself. It is that these biases now contaminate academic infrastructure with officially registered DOIs, creating permanent, harvestable misinformation. Until companies address the root cause rather than patching symptoms at release boundaries, every new model version will continue leaving ghost authors in its wake.

Large Language Models (LLMs) · Machine Learning · Ethics & Bias · Misinformation & Deepfakes
