Cloudflare vs AI Scrapers: Content Reclaims Its Value

In July 2025, Cloudflare announced something that grabbed headlines, but its full impact is only just beginning to surface.  

From that point on, any website protected by Cloudflare would block AI bots from scraping its content unless the site owner explicitly said otherwise.

It might not have sounded that dramatic at the time, especially amid the blur of AI news stories we see almost every day, but this move quietly upended one of the web’s unspoken rules: that anything online was fair game for machines to read, learn from, and regurgitate.

This wasn’t just a technical tweak. It was the internet taking a breath and saying: “Hang on, who’s actually getting paid here?”

Cloudflare’s new policy doesn’t just block. It lets creators choose who gets access and even charge for it, so it’s more like a gate with pricing options than a wall. It’s also a fundamental power shift, and it carries serious implications for the future of online content, the user experience, and how we value the people behind the things we read, watch, and learn from.

Let’s break down what’s really changed, how this affects both users and content creators, and why this could be the beginning of a very different internet.

The Era of Unpaid Data Is Ending 

Until recently, AI developers had free rein to train models on publicly available data (blogs, wikis, forums, newspapers, and social platforms) with little resistance. Web crawlers, often disguised or ignored, quietly hoovered up content in the background.

But this model breaks the fundamental economic balance that underpinned the internet. Content creators allowed Google and other search engines to index their content in exchange for referral traffic. In fact, Google became the dominant search engine in part by indexing the educational content that universities had made freely available, a smart move that played into the idealistic spirit of the early internet: open access, shared knowledge, mutual benefit. That referral traffic became page views, which became ad revenue, donations, subscriptions, and, ultimately, sustainability.

AI has changed the rules. Tools like ChatGPT, Claude, Perplexity, and Google’s AI Overviews (formerly the Search Generative Experience, or SGE) now present the answer directly, bypassing the site. The result is a dramatic collapse in referral traffic. One analysis showed ChatGPT and Perplexity send 96% less traffic than Google Search. Another found that websites could lose up to 60% of their clicks when Google includes an AI summary.
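To make those percentages concrete, here is a rough sketch in Python. The baseline of 10,000 monthly clicks is a made-up figure for illustration, not data from either analysis, and the two percentages describe different comparisons, so they are applied separately.

```python
def clicks_after_drop(baseline_clicks: int, drop_percent: int) -> int:
    """Clicks left after a percentage drop (integer arithmetic, rounds down)."""
    return baseline_clicks * (100 - drop_percent) // 100


# Hypothetical baseline: a site that gets 10,000 clicks a month from search.
BASELINE = 10_000

if __name__ == "__main__":
    # "Up to 60% of clicks lost" when an AI summary is shown.
    print(clicks_after_drop(BASELINE, 60))  # 4000
    # "96% less traffic" from AI tools compared with Google Search.
    print(clicks_after_drop(BASELINE, 96))  # 400
```

Even on these simplified assumptions, the gap between 10,000 visits and a few hundred is the difference between a site that can fund itself and one that can’t.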

From a user’s perspective, this is efficient! The answer is instant, concise, and usually good enough. But from a publisher’s view, it takes away the incentive to generate content.

Cloudflare’s response: block by default, charge if you want to play.

Wasn’t This Already Possible? 

Technically, yes. Website owners could already ask unwanted bots to stay away via robots.txt. But this was manual, obscure, and easy to circumvent: not all bots obeyed, and some disguised themselves as human users. Cloudflare now enforces the block at the network level, with added layers of authentication, bot detection and, crucially, the ability to charge.

It has also flipped the default, which is hugely significant. Before, you had to opt out of scraping. Now, you have to opt in. That’s no small detail: this subtle inversion shifts the power back to content creators.
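For anyone who hasn’t seen the robots.txt approach, here is a minimal sketch using Python’s standard-library parser. The rules and the example.com URL are illustrative assumptions, not taken from any real site. The key point is that the crawler runs this check on itself, so the file only works if the bot chooses to honour it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: asks one AI crawler to stay out, allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the robots.txt rules permit this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("GPTBot", "https://example.com/article"))     # False
    print(is_allowed("Googlebot", "https://example.com/article"))  # True
```

Nothing here stops a bot that skips the check or lies about its name, which is the gap that network-level enforcement is meant to close.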

A Value Shift for Online Content 

This is, fundamentally, about value, and I think charging for data is a good thing. It gives data value, and that creates an incentive to produce it. Cloudflare has created what is arguably the first viable framework for attaching economic worth to online content in the AI age. The shift from open access to conditional access forces a new question into the ecosystem: how much is this answer worth to a machine?

From a creator’s perspective, this is long overdue. But it’s not just about publishers clawing back revenue. As a user, I want AI to give me helpful, synthesised answers, but I also believe the creators who made those source materials deserve a share. The price I pay for my AI assistant should include some love for the people whose work made it possible.

Cloudflare’s approach introduces a trustless mechanism that doesn’t rely on the AI company playing nice. It builds value transfer into the plumbing of the web and that has the potential to create a genuinely rich content economy.

I heard a compelling argument against this change, and there’s a philosophical contradiction at play that we should address more openly.

All human creativity builds on what came before. Every writer, designer, or developer has absorbed, reworked, and drawn from someone else’s work. We read, we watch, we are inspired. But we don’t usually pay our idols every time we remix their ideas.

It’s a good point, but the problem with AI is scale and memory. AI builds on earlier work too, but it learns at a speed no person can match, retains vastly more, and sometimes regurgitates instead of developing its own take. The line between ‘inspired by’ and ‘copied from’ becomes harder to define, and that’s the grey area we now have to design around.

Who Opts Out, Who Opts In? 

We’ll likely see polarisation based on each website’s business model. Site owners who use Cloudflare but aren’t aware of this change will automatically be opted out, and that’s where flipping the default becomes powerful.

News sites, educational publishers, and forums are most likely to opt out or charge. These sites rely on attention, donations, or subscriptions, all of which suffer when users never visit the site.

Marketing blogs, portfolios, and product pages are more likely to opt in. For them, visibility is the priority. Being included in AI summaries might be more valuable than chasing ad clicks.

I imagine it could evolve into tiered access, where some sites allow brief summaries but charge for deeper analysis. Others may restrict real-time data while allowing evergreen content. We may even see new “AI-content feeds” created just for AI models’ consumption.

In this sense, Cloudflare’s update gives creators the ability to experiment with pricing, licensing, and visibility.

But What About SEO? 

The fear is valid: blocking the wrong crawler might make you invisible.

Importantly, Cloudflare does not block traditional search engine crawlers like Googlebot or Bingbot by default. These remain essential for SEO. The blocked bots are typically those used for AI model training or generative answers.
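As a rough illustration of that split, the sketch below sorts crawlers by the name they announce in their User-Agent header. The token lists are a small illustrative sample and the function names are made up for this example; this is not Cloudflare’s actual list or detection method, which relies on more than a self-declared name.

```python
# Illustrative user-agent tokens only; not Cloudflare's actual list.
SEARCH_CRAWLERS = ("googlebot", "bingbot")
AI_CRAWLERS = ("gptbot", "claudebot", "perplexitybot", "ccbot")


def classify(user_agent: str) -> str:
    """Return 'ai', 'search', or 'other' based on the User-Agent string."""
    ua = user_agent.lower()
    if any(token in ua for token in AI_CRAWLERS):
        return "ai"
    if any(token in ua for token in SEARCH_CRAWLERS):
        return "search"
    return "other"


def decide(user_agent: str, owner_allows_ai: bool = False) -> str:
    """Block AI crawlers unless the site owner has opted in; allow the rest."""
    if classify(user_agent) == "ai" and not owner_allows_ai:
        return "block"
    return "allow"


if __name__ == "__main__":
    print(decide("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # allow
    print(decide("Mozilla/5.0 (compatible; GPTBot/1.0)"))     # block
```

Because a User-Agent string is trivial to fake, a name check like this is only a starting point. It shows the policy, not the enforcement.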

However, AI answer crawlers are increasingly important. If you block the crawler that feeds Google’s AI summaries, you might still appear in regular search but not in the AI answer box. That might cut you off from visibility in the fastest-growing segment of search. Google is also rolling out AI Mode, which does away with the traditional results page (SERP). If most Google users start using that feature, you may get no traffic at all.

So, the decision becomes strategic: do you want to be part of the AI answer? If so, you must allow access, but you can negotiate terms or insist on compensation.

How Might AI Overviews Adapt? 

If more sites block or charge AI crawlers, AI-generated answers will need to become more humble if they are to keep producing up-to-date summaries and delivering value to end users. Rather than acting like a definitive authority, AI overviews might evolve into curated guides, surfacing highlights and actively promoting the original sources.

Some possible formats we could see evolve:

  • Attribution-first summaries: “According to the BBC, the cause of X is likely Y. Read more →”
  • Rich snippet bundles: short quotes pulled from 2–3 sources in a carousel format.
  • “Read more” CTAs built into answers: e.g. “This summary includes insights from X and Y. Want the full articles?”
  • Teaser-based UX: AI reveals just enough to be useful but entices deeper reading, much like Google’s featured snippets.

Over time, we might also see an AI loyalty model. If a user consistently interacts with sources from a specific creator or publication, the AI might prioritise those, building a relationship between reader, source, and assistant.

And if micropayments become more widespread, your AI could even tip creators or pay to unlock enhanced context. We could see pay-per-insight browsing become the new pay-per-click.

This is speculative but plausible, because AI needs data, and to keep getting it, it will need to attribute value and send users back to where the wisdom came from.

Real-World Impacts 

Here is how this shift is likely to affect different parties:

For Users

  • AI assistants may become slightly less comprehensive.
  • More nudges to "read the full article" or "visit the source."
  • Fewer hallucinations (if models are using licensed, recent data).

For Content Creators

  • More control. More leverage. Maybe even revenue.
  • Need to learn new strategies: AI optimisation (AIO), crawler permissions, marketplace pricing.
  • Decisions to make: block, allow, charge, or license?

For AI Companies

  • Pressure to license more data.
  • Possible increase in costs per query.
  • Higher demand for transparency and attribution.

For the Web as a Whole

  • Beginning of a two-tier system: data that’s open vs data that’s gated.
  • Risk of fragmentation, but also potential for fairer redistribution.
  • Rebirth of the 402 status code: “Payment Required” becomes real.
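For readers who haven’t met it, 402 is a real HTTP status code that has long been reserved for future use. Here is a minimal sketch of how a server might use it for AI crawlers. The header name, the price, and the function are hypothetical, chosen for illustration only; they are not Cloudflare’s actual protocol.

```python
from http import HTTPStatus

# Hypothetical per-request price; header name below is also illustrative.
PRICE_USD = "0.01"


def respond(is_ai_crawler: bool, agreed_to_pay: bool) -> tuple[int, dict[str, str], str]:
    """Return (status, headers, body) for a request to a gated article."""
    if is_ai_crawler and not agreed_to_pay:
        # Tell the crawler that payment is required and quote a price.
        return (
            HTTPStatus.PAYMENT_REQUIRED.value,
            {"X-Example-Price-USD": PRICE_USD},
            "Payment Required",
        )
    # Human visitors and paying crawlers get the content.
    return (HTTPStatus.OK.value, {}, "<html>full article</html>")


if __name__ == "__main__":
    print(respond(is_ai_crawler=True, agreed_to_pay=False)[0])  # 402
    print(respond(is_ai_crawler=True, agreed_to_pay=True)[0])   # 200
```

The appeal of this design is that the price signal lives in the same request and response cycle the web already uses, so no separate negotiation channel is needed.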

A New Social Contract for the Web 

The internet grew from openness, but in its maturity it now demands sustainability. The shift toward paid, permissioned AI access doesn’t kill the dream; it evolves it.

We are acknowledging that even in an age of machines, human knowledge has a cost. If you want machines to feed you truth, you must feed the humans who generate the data they rely on. Not just with exposure, but with something directly in return.

This model may be imperfect and friction-prone, but it’s a good start. In the end, I still believe AI will become the primary interface for most of our informational needs. But now, we’re finally beginning to ask what that future should cost, and who gets paid.

Let’s not just provide generated answers; let’s credit the people who made them possible.
