FSF Calls for Freedom in Anthropic Settlement: Demands Open Training Data and Model Weights
Key Takeaways
- The FSF identified its copyrighted book in Anthropic's training datasets, prompting a statement on the broader copyright and licensing issues in LLM development
- The FSF is leveraging the settlement to advocate for transparency in AI model development, including the release of complete training data and model weights
- The dispute highlights the tension between fair use rights (affirmed by the court for model training) and licensing obligations for dataset acquisition
Summary
The Free Software Foundation (FSF) has responded to a copyright infringement settlement notice in the class action lawsuit Bartz v. Anthropic, which alleges that Anthropic unlawfully downloaded copyrighted works from the Library Genesis and Pirate Library Mirror datasets to train its large language models. While a district court previously ruled that using the books for LLM training constituted fair use, the settlement addresses a separate question: whether the initial downloads were lawful. The FSF notes that its own copyrighted work, "Free as in Freedom: Richard Stallman's Crusade for Free Software," was included in Anthropic's training datasets without consideration of its license terms.
Rather than pursuing monetary compensation, the FSF is urging Anthropic and other LLM developers to adopt a freedom-focused approach by open-sourcing their training inputs, model weights, training configurations, and software source code alongside their models. The FSF emphasizes that since many copyrighted works are published under free licenses—including its own GNU Free Documentation License—the ethical path forward is to provide complete transparency and user freedom. The foundation stated that if it were to litigate such cases, it would seek user freedom as compensation rather than financial damages.
- The FSF's position prioritizes computing freedom and user rights over monetary compensation in copyright disputes
Editorial Opinion
The FSF's statement represents a principled stance that could reshape industry practices if adopted widely. By demanding transparency rather than financial settlements, the foundation highlights a crucial distinction: a fair use ruling for training does not absolve companies of their obligation to respect free software licenses and user freedom. However, this position is likely to face resistance from commercial AI labs reluctant to release proprietary training data and weights, raising the question of whether freedom-based settlements can become an industry standard or will remain aspirational.