RSS Co-Creator Launches 'Really Simple Licensing' Standard for AI Web Crawling
Key Takeaways
- RSL Collective launched Really Simple Licensing (RSL), an open standard allowing publishers to set licensing terms for AI web crawlers through robots.txt files
- The initiative is co-founded by RSS co-creator Eckart Walther and former Ask.com CEO Doug Leeds, with Reddit, Yahoo, and Medium as launch partners
- RSL addresses growing concerns about AI companies using web-scraped content for training without compensation or permission from content creators
Summary
The RSL Collective, a nonprofit co-founded by RSS co-creator Eckart Walther and former Ask.com CEO Doug Leeds, has launched Really Simple Licensing (RSL), an open content licensing standard designed to allow web publishers to set terms for AI companies crawling their content for training data. Announced on September 10, 2025, the standard enables publishers to specify licensing requirements directly in their robots.txt files, potentially creating a framework for compensating content creators whose work is used to train generative AI models.
Reddit, Yahoo, and Medium are among the platforms adopting the RSL standard at launch. The initiative comes amid growing tension between content publishers and AI companies over the use of web-scraped data to train large language models without compensation or permission. By extending the existing robots.txt protocol, which has traditionally been used to control search engine crawling, RSL aims to establish clear commercial terms for AI data usage.
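To make the robots.txt mechanism concrete, here is a minimal sketch of how a crawler might look for licensing terms before fetching a site. It assumes the terms are advertised through a `License:` line that points to a separate terms document. That directive name, the example.com URL, and the `find_license_urls` helper are illustrative assumptions, not details confirmed by the announcement as summarized here.

```python
def find_license_urls(robots_txt: str) -> list[str]:
    """Return the value of every `License:` line in a robots.txt body.

    The `License` field name is an assumption made for this sketch.
    """
    urls = []
    for raw_line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so the URL's own colons survive.
        field, value = line.split(":", 1)
        if field.strip().lower() == "license" and value.strip():
            urls.append(value.strip())
    return urls


# Hypothetical robots.txt body, for illustration only.
SAMPLE_ROBOTS_TXT = """\
# robots.txt for example.com
User-agent: *
Allow: /
License: https://example.com/license.xml
"""

print(find_license_urls(SAMPLE_ROBOTS_TXT))
```

A crawler that respects the terms would fetch the listed URL and check its conditions before collecting any content. Nothing in this sketch enforces that step, which is why the standard's effect depends on voluntary compliance.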
The standard represents an attempt to create industry-wide norms around AI training data licensing, potentially shifting the paradigm from unrestricted web scraping to permission-based or compensated data collection. As generative AI companies continue to require massive datasets for model training, RSL could become a critical piece of infrastructure in the evolving relationship between content creators and AI developers.
Editorial Opinion
The launch of RSL represents a potentially pivotal moment in establishing norms around AI training data rights, though its success depends entirely on whether major AI companies choose to respect these licensing terms voluntarily. Unlike traditional robots.txt restrictions for search engines—which companies generally honor to maintain good relationships—AI firms may calculate that the competitive advantage of accessing training data outweighs reputational concerns. The standard's real test will come when publishers attempt to enforce these terms, potentially setting up legal battles that could define whether web content is a commons or protected intellectual property in the AI age.



