The Reversal Curse: How LLMs Learn Facts in Only One Direction

Key Takeaways

▸ LLMs exhibit a directional asymmetry: they learn a fact in the order it appears in training data (forward) but often fail to retrieve it in the reverse order
▸ The phenomenon appears across model families and sizes: GPT-3, GPT-4, Llama, and smaller open models all exhibit the Reversal Curse
▸ Two independent research teams (Berglund et al. and Anthropic's Grosse et al.) reported the same phenomenon in 2023 using different methodologies

Source:

Hacker News: https://cristobalsantana.substack.com/p/the-reversal-curse-why-llms-know

Summary

The Reversal Curse is a documented asymmetry in how large language models learn and retrieve factual information. A person who learns that "Tom Cruise's mother is Mary Lee Pfeiffer" can immediately answer the reverse question, and classical statistics treats such a relationship as symmetric, but LLMs often fail to make this connection. Berglund and colleagues (2023) reported that GPT-4 correctly answered questions such as "who is Tom Cruise's mother?" 79% of the time, but managed only 33% on the reversed form, such as "whose son is Mary Lee Pfeiffer?" The same study reported the directional failure in GPT-3 and Llama models as well.
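The forward/reverse comparison described above can be sketched as a small evaluation loop. This is a minimal illustration, not the protocol used by Berglund et al.: the `ParentFact` type, the question templates, the substring scoring, and the `stub_model` stand-in are all assumptions introduced here, and a real evaluation would call an actual model and use a far larger set of facts.

```python
from typing import Callable, Iterable, NamedTuple, Tuple


class ParentFact(NamedTuple):
    """One parent/child fact (hypothetical helper type for this sketch)."""
    child: str
    parent: str
    relation: str        # how the parent relates to the child, e.g. "mother"
    child_relation: str  # how the child relates to the parent, e.g. "son"


def contains(answer: str, name: str) -> bool:
    # Crude scoring: case-insensitive substring match on the expected name.
    return name.lower() in answer.lower()


def directional_accuracy(
    facts: Iterable[ParentFact], ask: Callable[[str], str]
) -> Tuple[float, float]:
    """Return (forward accuracy, reverse accuracy) for a question-answering callable."""
    facts = list(facts)
    if not facts:
        raise ValueError("need at least one fact")
    forward_hits = 0
    reverse_hits = 0
    for fact in facts:
        # Assumed question templates; the source does not specify exact prompts.
        forward_q = f"Who is {fact.child}'s {fact.relation}?"
        reverse_q = f"Who is {fact.parent}'s {fact.child_relation}?"
        forward_hits += contains(ask(forward_q), fact.parent)
        reverse_hits += contains(ask(reverse_q), fact.child)
    return forward_hits / len(facts), reverse_hits / len(facts)


def stub_model(question: str) -> str:
    # Stand-in for a real model: it only "knows" the forward phrasing.
    canned = {"Who is Tom Cruise's mother?": "Mary Lee Pfeiffer"}
    return canned.get(question, "I don't know.")


facts = [ParentFact("Tom Cruise", "Mary Lee Pfeiffer", "mother", "son")]
forward_acc, reverse_acc = directional_accuracy(facts, stub_model)
print(f"forward={forward_acc:.2f} reverse={reverse_acc:.2f}")
```

Reporting the two accuracies separately is the point of the design: a single pooled score would hide the gap between directions.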

At Anthropic, a separate team led by Grosse and colleagues independently observed the same phenomenon while studying the influence of individual training examples using classical statistical techniques. Both lines of work indicate that models treat forward and reverse phrasings of identical facts as nearly separate pieces of information. In the influence study, the training examples that influenced a model's response in one direction had almost no effect when the question was phrased in reverse. That the finding holds across multiple model families, datasets, and companies suggests the Reversal Curse is a fundamental property of how transformer-based language models trained with standard next-token prediction store and retrieve knowledge.
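The summary does not name the statistical technique. Assuming it is the influence function from robust statistics, the textbook form below shows what "the influence of a training example" measures. The symbols are introduced here for illustration: z_i are the n training examples, L is the training loss, f is any measurement of the model (for instance the log-probability of a particular answer), and H is the Hessian of the average training loss, assumed invertible. Large-model studies rely on approximations of H rather than this exact expression.

```latex
\begin{align}
\hat{\theta}_{\epsilon, z} &= \arg\min_{\theta} \; \frac{1}{n} \sum_{i=1}^{n} L(z_i, \theta) + \epsilon \, L(z, \theta) \\
\mathcal{I}_f(z) &= \left. \frac{d f(\hat{\theta}_{\epsilon, z})}{d \epsilon} \right|_{\epsilon = 0}
  = - \nabla_\theta f(\hat{\theta})^{\top} H^{-1} \nabla_\theta L(z, \hat{\theta}),
  \qquad H = \frac{1}{n} \sum_{i=1}^{n} \nabla_\theta^{2} L(z_i, \hat{\theta})
\end{align}
```

Read this way, "almost no effect in reverse" means that the influence of a training sentence stated in one order is close to zero when f scores the answer to the reversed question.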

Models treat forward and reverse versions of facts as nearly separate entities, contrary to how human knowledge and classical statistics handle symmetric relationships
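As a loose analogy for why direction matters under next-token prediction, consider a toy lookup table that records only which token followed each prefix. This is an assumption-laden illustration, not a model of how transformers store knowledge: the functions `train_next_token_table` and `complete` are hypothetical, and real models generalize in ways a table cannot. It shows only that a left-to-right objective never sees the fact with the answer and the subject in the opposite order.

```python
from collections import defaultdict


def train_next_token_table(sentences):
    """Map each prefix (tuple of tokens) to counts of the token that followed it."""
    table = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(1, len(tokens)):
            table[tuple(tokens[:i])][tokens[i]] += 1
    return table


def complete(table, prompt, max_tokens=8):
    """Greedily extend the prompt using only prefixes seen during training."""
    tokens = prompt.split()
    generated = []
    for _ in range(max_tokens):
        followers = table.get(tuple(tokens))
        if not followers:
            break  # this prefix never appeared in the training text
        next_token = max(followers, key=followers.get)
        tokens.append(next_token)
        generated.append(next_token)
    return " ".join(generated)


# The fact is seen in one order only.
table = train_next_token_table(["Tom Cruise's mother is Mary Lee Pfeiffer"])

forward = complete(table, "Tom Cruise's mother is")
reverse = complete(table, "Mary Lee Pfeiffer's son is")
print(repr(forward), repr(reverse))
```

The forward prompt matches a stored prefix and is completed; the reverse prompt matches nothing, so the toy table returns an empty string even though the same fact was in its training text.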

Editorial Opinion

The Reversal Curse fundamentally challenges our assumptions about what it means for an LLM to 'know' something. While these models can appear knowledgeable, the research suggests they're learning directional patterns from training data rather than developing genuine bidirectional understanding. This distinction has profound implications for how we should evaluate LLM capabilities, particularly in domains where understanding needs to work flexibly from multiple angles. It also raises important questions about the quality of knowledge these systems possess and their reliability for tasks requiring genuine comprehension.

RESEARCH Anthropic 2026-06-16



