Meta Claims BitTorrent Uploading of Pirated Books for AI Training is Fair Use
Key Takeaways
- Meta argues that uploading pirated books via BitTorrent while acquiring AI training data qualifies as fair use because uploading is inherent to the BitTorrent protocol
- A court previously ruled that using pirated books to train Meta's Llama LLM was fair use, but that ruling did not resolve the claims over how the books were downloaded and shared via BitTorrent
- Plaintiffs claim Meta introduced its new fair use defense for uploading improperly, after discovery deadlines had passed, while Meta contends the defense was previously disclosed
Summary
Meta is defending its use of pirated books for AI training by arguing that uploading copyrighted content via BitTorrent qualifies as fair use. The company faces a class-action lawsuit from prominent authors including Richard Kadrey, Sarah Silverman, and Christopher Golden, who allege Meta downloaded pirated books from shadow libraries like Anna's Archive to train its Llama large language model. A California federal court previously ruled that using pirated books for AI training constitutes fair use, but that ruling did not settle the claims over downloading and sharing these books through BitTorrent's peer-to-peer protocol, which Meta still has to answer.
In a recent supplemental interrogatory response, Meta introduced a new defense strategy, arguing that the automatic uploading inherent to BitTorrent technology should also be considered fair use. The company contends that BitTorrent was the only practical method to obtain the datasets in bulk from sources like Anna's Archive, making the upload process an unavoidable component of the download. Meta characterized this uploading as "part-and-parcel" of acquiring the training data necessary for its transformative AI development.
The plaintiffs' legal team has challenged the timing of Meta's new defense, filing a letter with Judge Vince Chhabria claiming the company made an improper late submission after discovery deadlines had passed. They argue that Meta had been aware of the uploading claims since November 2024 but never raised this fair use defense, even when the court specifically asked about it. Meta's attorneys countered that the fair use argument for direct copyright infringement was explicitly flagged in the parties' joint case management statement from December 2024.
The case represents a significant test of how copyright law applies to AI training data acquisition, particularly when companies rely on pirated materials accessed through peer-to-peer networks. Meta has emphasized that obtaining this data helped establish U.S. global leadership in artificial intelligence, framing the issue as one of national technological competitiveness rather than simple copyright infringement.
- The case could set important precedent for how copyright law applies to AI companies' acquisition of training data from pirated sources



