
Cloudflare, a connectivity cloud company, has launched a new marketplace that reimagines the relationship between website owners and AI companies, offering publishers more control over their content.
The web giant revealed new tools that let site owners automate the management of their robots.txt file and restrict AI bot access, particularly to ad-monetised pages.
What is robots.txt?
A robots.txt file is a plaintext document that tells web crawlers which areas of a site they’re permitted to visit and which sections they should avoid.
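For example, a minimal robots.txt file that asks OpenAI's GPTBot to stay away from the entire site while leaving it open to every other crawler might look like the sketch below (crawlers are expected, though not technically forced, to honour these rules):

    User-agent: GPTBot
    Disallow: /

    User-agent: *
    Disallow: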
Cloudflare released these tools in response to growing concern that AI-powered bots are harvesting website content without generating substantial referral traffic for the original publishers.
Unlike traditional search crawlers, which send visitors back to the pages they index, a new wave of AI-focused crawlers from companies like OpenAI and Anthropic has been scraping data online to train their models, often with little or no traffic in return.
Cloudflare’s analysis shows that OpenAI’s GPTBot made roughly 1,700 content requests for each referral it generated, while Anthropic’s ClaudeBot exhibited an even more extreme ratio of 73,000:1.
In response to the changing landscape, Cloudflare launched a managed robots.txt feature that automatically creates or updates a site’s file to signal AI crawlers such as Google-Extended and Applebot-Extended that the content should not be used for training.
The tool preserves any existing rules and remains SEO-friendly, making it simple for users, even without technical expertise, to express how they want their site data treated.
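The exact directives Cloudflare's managed file generates may differ, but an opt-out aimed at the training-focused crawlers named above would resemble the following illustrative entries, which tell Google-Extended and Applebot-Extended not to use any part of the site:

    User-agent: Google-Extended
    Disallow: /

    User-agent: Applebot-Extended
    Disallow: /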
Cloudflare will now prompt all new customers to enable the managed robots.txt feature by default. Both new tools are free and available to all Cloudflare users.