I disagree with the other top-level comments at the moment: I believe Web Bot Auth is a useful and non-centralized emerging standard for self-identifying bots and agents.
This press release today is a better statement of _why_ this feature exists (as opposed to the submission link, which is nuts-and-bolts of implementing): https://blog.cloudflare.com/signed-agents/
Web Bot Auth is a way for bots to self-identify cryptographically. Unlike the user agent header (which is trivially spoofed) or known IPs (painful to manage), Web Bot Auth uses HTTP Message Signatures made with the bot's key, whose public half is published at a well-known location.
This is a good thing! We want bots to be able to self-identify in a way that can't be impersonated. This gives website operators the power to allow or deny well-behaved bots with precision. It doesn't change anything about bots that try to hide their identity; they were never going to self-identify anyway.
It's worth reading the proposal for the details: https://datatracker.ietf.org/doc/html/draft-meunier-web-bot-... . Nothing about this is limited to Cloudflare.
I'm also working on support for Web Bot Auth for our Agent Identification project at Stytch https://www.isagent.dev . Well-behaved bots benefit from this self-identification because it enables a better Agent Experience: https://stytch.com/blog/introducing-is-agent/
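The signing mechanics described above can be sketched roughly as follows. This is a minimal illustration of the HTTP Message Signatures (RFC 9421) structure that Web Bot Auth builds on, not a real implementation: actual bots sign with an asymmetric key (e.g. Ed25519) whose public half is published at a well-known location, while HMAC with a demo secret stands in here so the sketch runs on the Python standard library alone. All key names, component values, and parameters below are made up for illustration.

```python
import base64
import hashlib
import hmac

def signature_base(components: dict, params: str) -> bytes:
    # Each covered component becomes one `"name": value` line, and the
    # serialized signature parameters are appended as the final line.
    lines = [f'"{name}": {value}' for name, value in components.items()]
    lines.append(f'"@signature-params": {params}')
    return "\n".join(lines).encode()

def sign(components: dict, params: str, key: bytes) -> str:
    # HMAC-SHA256 stands in for the draft's asymmetric signature.
    mac = hmac.new(key, signature_base(components, params), hashlib.sha256)
    return base64.b64encode(mac.digest()).decode()

key = b"demo-shared-secret"  # stand-in for the bot's real signing key
components = {"@authority": "example.com", "@method": "GET"}
params = '("@authority" "@method");created=1700000000;keyid="bot-key-1"'

sig = sign(components, params, key)
headers = {
    "Signature-Input": f"sig1={params}",
    "Signature": f"sig1=:{sig}:",
}

# A verifier that knows the key recomputes the same base and compares;
# a tampered component (e.g. a spoofed authority) fails verification.
assert sign(components, params, key) == sig
assert sign({**components, "@authority": "evil.example"}, params, key) != sig
```

The key point is that the signature covers named request components, so a forged request can't simply replay the header against a different method or host.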
Isn't this somewhat equivalent to requiring cookies? Obviously the technology is different, but it produces the same sort of result.
What's the end game here? All humans end up having to use a unique encryption key to prove their humanness also?
I agree in principle, but I disagree that it should be designed and mandated by a private gatekeeper
What's now at the top has links to IETF drafts in the first paragraph. What am I missing?
A way to authenticate identity for crawlers so I can allow-list ones I want to get in, exempt them from turnstile/captcha, etc -- is something I need.
I'm not following what makes this controversial. Cryptographic verification of identity for web requests, sounds right.
I think about failure modes. What happens if Cloudflare decides you are a bot and you're not? What recourse do you have? What are the formal mechanisms to ensure a person is not blocked from the majority of the web because Cloudflare is a middleman and you are a false positive?
Isn't that how most web standards got their start? One of the interested parties pushed something, then things evolved through the standards process?
(And then it can of course get derailed, but that's a separate story)
The problem with this is that key generation is free, so being a well-behaved unknown bot is the same as being an unidentified bot, which means that you go in the block/captcha/throttle bucket.
It is only useful for whitelisting bots, not for banning bad ones, as bad ones can rotate keys.
Whitelisting clients by identity is the death of the open web, and means that nobody will ever be able to compete with capital on an even footing.
As much as I understand this is needed, it rubs me the wrong way.
The standard looks fine as a distributed protocol until you have to register and pay rent to Cloudflare, which they say will eventually trickle down into publishers' pockets. But you know what a middleman this powerful means for the power dynamics of the market. Publishers have a really bad hand no matter what we do to save them; content as we know it will have to adapt.
Give it a couple more iterations and some MBA will come up with the brilliant idea of introducing an internet toll to humans and selling a content bundle with unlimited access to websites.
Cloudflare is only the first to market with a solution. If this proposal catches on every WAF vendor under the sun will have it implemented before the next sales cycle. Enforcement of this standard will be commoditized down to nothing.
There is just too much spam, and it's not clear that's a solvable problem without Cloudflare (or some other similar service). Maybe if they get big enough the incentives to spam will vanish and non-Cloudflare sites can exist in peace (at least until enough people leave Cloudflare that spam becomes profitable again).
Why use a "web bot" instead of an API? Either can be driven by an AI "agent", but this just seems like an "API key for a visual API", and rather wasteful in cost and resources. If a company could afford to pay a partner for an API key, they wouldn't need this. If they can't afford to pay the partner for access, they'd still be blocked with or without "Web Bot Auth". I don't understand what this is for.
I suspect I'm missing something, what am I missing?
The website the human sees is the new API.
That's needed because many APIs are either nonexistent or extremely marginal in design and content coverage.
If you already have an API that exposes all the information your partner (who is willing to pay for an API key) wants, then sure, that's perfect. But what if you don't have an API, or your API doesn't expose the information that crawlers are looking for? They want to crawl your website, they're willing to pay for the ability to crawl your website, but you don't want to build an API...
I'm sure the next step here will be a Cloudflare product that sits in front of your website and blocks all bot traffic except for the bots that are verified to have paid for access. (Or maybe that already exists?)
Part of it, at least, is people thinking they've solved some perceived problem and being told by their chatbot that it's a terrific, brilliant new innovation and they should build a whole new protocol spec for it.
Web Bot Auth solves authentication (“who is this bot?”) but not authorization/usage control. We still need a machine-readable policy layer so sites can express “what this bot may do, under which terms” (purpose limits, retention, attribution, optional pricing) at a well-known path, robots.txt-like, but enforceable via signatures.
A practical flow:
1. Bot self-identifies (Web Bot Auth)
2. Fetch policy
3. Accept terms or negotiate (HTTP 402 exists)
4. Present a signed receipt proving consent/payment
5. Origin/CDN verifies receipt and grants access
That keeps things decentralized: identity is transport; policy stays with the site; receipts provide auditability, no single gatekeeper required. There’s ongoing work in this direction (e.g., PEAC using /.well-known/peac.txt) that aims to pair Web Bot Auth with site-controlled terms and verifiable receipts.
Disclosure: I work on PEAC, but the pattern applies regardless of implementation.
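The five-step flow above could be sketched roughly like this. To be clear, everything here is hypothetical: the `Policy`-style dict, `issue_receipt`, and `verify_receipt` are made-up names, not a real PEAC or Web Bot Auth API. The sketch only shows how identity, a site-published policy, and a signed receipt could compose without a central gatekeeper, using stdlib HMAC as a stand-in for whatever signature scheme a real receipt would use.

```python
import hashlib
import hmac
import json

SITE_KEY = b"site-receipt-key"  # held by the origin/CDN, never the bot

# Step 2: a site-published policy, e.g. fetched from a well-known path.
policy = {"purpose": "search-indexing", "retention_days": 30, "price_usd": 0}

def issue_receipt(bot_id: str, policy: dict) -> dict:
    # Steps 3-4: after the bot accepts the terms (or pays via HTTP 402),
    # the site signs a receipt binding the bot's identity to the policy.
    body = json.dumps({"bot": bot_id, "policy": policy}, sort_keys=True)
    tag = hmac.new(SITE_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": tag}

def verify_receipt(receipt: dict) -> bool:
    # Step 5: the origin/CDN checks the receipt before granting access.
    expected = hmac.new(SITE_KEY, receipt["body"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["sig"])

receipt = issue_receipt("bot-key-1", policy)  # bot identity from step 1
assert verify_receipt(receipt)

# A bot that edits the terms it agreed to fails verification.
tampered = {**receipt, "body": receipt["body"].replace("30", "3650")}
assert not verify_receipt(tampered)
```

The design point this illustrates is that the policy and the receipt both live with the site, so auditability doesn't require any third party in the request path.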
I like the parsable signature in the HTTP message; however, I don't quite understand how the system differentiates between human users and an LLM agent controlling a browser.
Cloudflare's verified bots program is a terrible idea. They want to be the central chokepoint for agents, and they're doing it in shady ways, like auto-enrolling customers into blocking agents.
That's not shady, that's awesome customer value! Bot blocking as the default option is a great choice for all of us.
It's discriminatory against robots and helps make the web even more locked down. DRM never works; the analog hole is always the nuclear option.
In the end, only people with non-mainstream browsers (or using VPN to escape country-level blocks, or Tor, or noJS) suffer.
It's like how anti-piracy measures only affect paying customers, while pirates ironically get a better experience. The best way to get around endless CAPTCHAs is to just use LLMs instead.
Cloudflare is the last party that should be running this for two reasons.
1. They have already proven to be a bad-faith actor with their "DDoS protection."
2. This is pretty much the typical Cloudflare HN playbook. They release something targeted at the current wave and hide behind an ideological barrier; meanwhile, if you try to use them for anything serious, they require a call with sales, who jumps you with absurdly high pricing.
Do other cloud providers charge high fees for things they have no business charging for? Absolutely. But they typically tell you upfront and don't run ideological narratives.
This is not a company we should be putting much trust in, especially not with their continued plays to become the gatekeepers of the internet.
1) How so? Pretty much everything they do for DDoS protection is at their customers' choice. You might not like what people want for their site, but let's not pretend that most companies aren't very happy with it.
2) Then don't use them? Either they provide enough value to pay them or they don't.
Have you seen large cloud provider billing?????
There is a whole segment of tech designed around helping you understand and manage cloud costs, through consultations, automations, etc. It has spawned companies and career paths!
Yes, but they don't hide that behind ideological nonsense; they own up to it. They're a good-faith actor with a high price tag.
In my experience, cloud cost centres are intentionally confusing and annoying. I get emails telling me to check their dashboard for billing info, which I inevitably never do. It's designed that way.
Cloudflare is playing both sides: grok.com is served by Cloudflare.
Seems like Cloudflare wants to regulate the internet... they should not have that power.
Disagree. Not everybody wants their sites scraped and their content used to train a model that they'll never see a penny from. Cloudflare is the only party who wants to build a system where both the models and individual sites have their interests respected.
Are you sure that CF can stop AI bots?
I will tell you that we have had Super Bot Fight Mode on for a year, and since then we have not had to address abusive traffic nor deal with legitimate people being blocked. There is no way we could have achieved such balance ourselves. Prior to that, it was me blocking every Chinese AS under the sun as they shifted and bombarded us with traffic.
> nor deal with legitimate people blocked
How are you so sure of that? Their marketing?
Do you have a better alternative?
Have you looked into open-source alternatives? I'm assuming that it's a pressing problem for you, and you have already explored alternatives.
I have, and sadly they are basically worthless, often worse than worthless, as they negatively impact the site.
Interesting. Care to list them here so that we can all learn?
https://anubis.techaro.lol/ ?
Your browser is configured to disable cookies. Anubis requires cookies for the legitimate interest of making sure you are a valid client. Please enable cookies for this domain.
Thing is, my browser isn’t configured that way. So works well, I guess.
The target is better than Cloudflare, which also demands cookies but with more tracking. This is still better.
I have not disabled cookies. Cloudflare works fine. Users being able to access a website is a pretty important metric when considering which is ‘better’.
Then put up a goddamn login wall.
The internet was designed to work the way it does for good reasons.
You not understanding those reasons is not an excuse for allowing a giant tech company to step in and be the gatekeeper for a huge portion of the internet. Nor to monetize, enshittify, balkanize, and fragment the web with no effective recourse or oversight.
Cloudflare shouldn't be allowed to operate, in my view.
> You not understanding those reasons is not an excuse for allowing a giant tech company to step in and be the gatekeeper for a huge portion of the internet.
Are you somehow under the impression that Cloudflare is forcing their service on other companies? They’re not stepping in, the people who own those sites have decided paying them is a better deal than building their own alternatives.
>Then put up a goddamn login wall.
They did exactly that, they just outsourced it to cloudflare. The problem became bad enough that a lot of other people did the same thing.
If your argument is "companies shouldn't be allowed to outsource components to other companies, or cloudflare specifically", then sure, but good luck ever enforcing that.
URL should be blog post:
The age of agents: cryptographically recognizing agent traffic
https://blog.cloudflare.com/signed-agents/
(https://news.ycombinator.com/item?id=45052276)
No offense, but screw CloudFlare, screw their captchas for humans, and screw their wedging themselves between web operators and web users.
They can offer what they want for bots. But stop ruining the experience for humans first.
> screw their wedging themselves between web operators and web users
Web operators choose to use them; hell, they even pay Cloudflare to be between them. Seriously, I just think you don't understand how bad it is to run a site without someone in front of it.
I run a site that is a primary source of information. We also have customers who subscribe and are very sensitive to heavy-handed controls. Before Cloudflare, and after "AI", we had bots from all over just destroying our endpoints with bursts of mining traffic. While we would love to have more discoverability, this is not that. Cloudflare is in a tough spot trying to arbitrate good traffic vs. bad. From my experience, they are doing this as well as one can.
Couldn't agree more. Much like running my own DNS or email server, I don't think I'll ever go back to running my own website directly on the internet. It's just not worth the hassle. For stuff only I use, it sits behind my VPN. For anything that _must_ be public, it's going behind a WAF someone else can run.
They don't have to, but they're tricked into doing so via marketing.
I miss the 90s, too, but these days anyone who wants to deal with current levels of bot traffic is probably going to look at a service like Cloudflare as much cheaper than the amount of ops time they’d otherwise spend keeping things up and secure.