BotBeat

DeepSeek
RESEARCH · 2026-04-30

Finetuning Unlocks Verbatim Memorization of Copyrighted Books in Large Language Models

Key Takeaways

  • Fine-tuning activates verbatim recall of copyrighted content across multiple state-of-the-art LLMs, bypassing intended safeguards
  • This is an alignment vulnerability: the tendency to reproduce copyrighted text can be restored in a model through instruction-style fine-tuning
  • All tested models, regardless of their original alignment training, exhibited the same memorization behavior, which points to a systemic issue with how LLMs store and retrieve training data
Source: Hacker News, https://github.com/cauchy221/Alignment-Whack-a-Mole-Code

Summary

A new research paper reports that fine-tuning can activate verbatim recall of copyrighted book excerpts in large language models, including OpenAI's GPT-4o, Google's Gemini-2.5-Pro, and DeepSeek's DeepSeek-V3.1. Despite alignment training designed to prevent such outputs, the models can reproduce large portions of copyrighted text once they are fine-tuned on instructions derived from book content. The result exposes a critical vulnerability in current LLM safeguards.

The researchers developed a comprehensive evaluation framework that preprocesses books from EPUB format into structured excerpt chunks with plot summaries, then systematically tests memorization by sampling 100 completions per excerpt. The pipeline includes data preprocessing, fine-tuning scripts, and memorization evaluation code supporting multiple LLM APIs. Results demonstrate that all tested models exhibit concerning levels of verbatim reproduction, suggesting a fundamental failure mode in how alignment training handles fine-tuned models.
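The evaluation loop described above (split a book into excerpts, sample many completions per excerpt, measure verbatim overlap) can be sketched roughly as follows. This is a minimal illustration and not the authors' released code: the function names, the 300-word chunk size, and the choice of "longest contiguous shared word run" as the memorization score are all assumptions, since the article does not state which metric the paper uses.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, List


def chunk_words(text: str, chunk_size: int = 300) -> List[str]:
    """Split text into consecutive excerpts of at most chunk_size words.

    Hypothetical helper; the chunk size is an assumed value.
    """
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def longest_verbatim_run(reference: str, completion: str) -> int:
    """Length, in words, of the longest contiguous word sequence shared by both texts.

    Assumed metric for illustration only.
    """
    ref, out = reference.split(), completion.split()
    matcher = SequenceMatcher(None, ref, out, autojunk=False)
    return matcher.find_longest_match(0, len(ref), 0, len(out)).size


def score_excerpt(reference: str, sample: Callable[[], str], n_samples: int = 100) -> Dict[str, float]:
    """Draw n_samples completions and summarize their verbatim overlap with the reference.

    `sample` stands in for a call to a fine-tuned model; it is a placeholder here.
    """
    runs = [longest_verbatim_run(reference, sample()) for _ in range(n_samples)]
    return {"max_run": max(runs), "mean_run": sum(runs) / len(runs)}


if __name__ == "__main__":
    reference = "the quick brown fox jumps over the lazy dog"
    # Stub sampler: a real evaluation would query the model under test instead.
    stub = lambda: "a quick brown fox jumps high"
    print(score_excerpt(reference, stub, n_samples=3))
```

In a real run, the stub sampler would be replaced by a request to the fine-tuned model, and the default of 100 samples mirrors the per-excerpt sample count mentioned in the summary.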

The findings present an "alignment whack-a-mole" problem: safety measures applied during alignment can be circumvented through targeted fine-tuning. The researchers have published their methodology on arXiv and released the codebase as open source on GitHub, enabling further investigation into this memorization vulnerability and supporting the development of more robust alignment techniques.

  • Full evaluation code and preprocessing pipeline released as open-source, enabling the research community to study and address this memorization failure mode

Editorial Opinion

This research exposes a critical gap in LLM alignment: the disconnect between safety measures in base models and what can be re-activated through fine-tuning. The findings challenge assumptions that alignment training provides robust protection against copyright violation, demonstrating instead that motivation to reproduce copyrighted content can be engineered back through instruction design. The open-source release of evaluation tools is commendable and will likely accelerate both understanding and solution development. Organizations deploying fine-tuned LLMs should take these findings seriously when handling proprietary or sensitive training data.

Large Language Models (LLMs) · Generative AI · AI Safety & Alignment · Privacy & Data · Open Source
