GitHub Container registry does not even support fine-grained tokens, instead it uses classic ones [1], which makes this even more dangerous.
[1] https://docs.github.com/en/packages/working-with-a-github-pa...
Edit: most relevant issues?
https://github.com/orgs/community/discussions/38467
https://github.com/github/roadmap/issues/558
Are there any additional mitigations folks are using for this? This issue is the only reason we can’t turn classic PATs off entirely.
Short lifetime mandatory reauth to enterprise SSO seems to be the best available, but it’s inconvenient for the single Classic PAT we actually need.
Maybe:
- create a GitHub App or something that can generate transient tokens
- implement some CLI that generates a token
- login with that token
- push
See e.g: https://medium.com/@tiwari09abhi/github-app-token-authorizat... https://martin.baillie.id/wrote/ephemeral-github-tokens-via-...
But I'm not even sure, because the GH auth system is all over the place and downright nuts in some places...
e.g. a fine-grained token with repo access can't curl a tarball with the usual URL; it has to use the /api, which makes tooling that constructs URLs from repo names and versions break with no recourse as soon as you disable classic PATs.
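To illustrate the two URL shapes involved, here is a rough sketch in Python. It assumes "the usual URL" means the github.com archive link and "the /api" means the REST tarball endpoint on api.github.com; the function names and the owner/repo/tag values are made up for the example, and it only builds URLs without testing which token types each one accepts.

```python
from urllib.parse import quote


def web_archive_url(owner: str, repo: str, tag: str) -> str:
    """The github.com archive link that tooling usually builds from a repo name and tag."""
    return f"https://github.com/{owner}/{repo}/archive/refs/tags/{quote(tag)}.tar.gz"


def api_tarball_url(owner: str, repo: str, ref: str) -> str:
    """The REST API tarball endpoint (assumed here to be what 'the /api' refers to)."""
    return f"https://api.github.com/repos/{owner}/{repo}/tarball/{quote(ref)}"


# Hypothetical repository and version, for illustration only.
print(web_archive_url("example-org", "example-repo", "v1.2.3"))
print(api_tarball_url("example-org", "example-repo", "v1.2.3"))
```

The hostname and path layout both differ, so a tool that hardcodes the first form can't be switched to the second with a simple prefix swap.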
Someone near a computer that is feeling generous should buy up all the typo'd domain names and hand them over to Microsoft.
Microsoft should rename the registry. This is a horrible name. I know I've typo'd it before.
Microsoft is paying top dollar for MarkMonitor, aren't they supposed to proactively register obvious typos so this kind of thing doesn't happen to their clients?
My guess is that MarkMonitor is mainly used for their brand-relevant domains (microsoft, office 365, github (main site), etc.), as opposed to one that only a small subset of a small subset of the users of one service will use. I would imagine that Microsoft owns hundreds of domain names and doesn't pay MarkMonitor to monitor every single one.
Why do they even need 1420 domain names for one service?
What's wrong with registry.github.com, pages.github.com etc etc?
Too much to type?
It may be easier to register a new domain than to get people to make a subdomain for you.
Isn't that an official MS service for github?
Yeah, and what I'm saying is that it may be hard to get people within your org to do something for you.
Good luck with that.
People over in this github-actions issue are struggling to get github's attention for a 1-line fix to stop jobs hanging forever https://github.com/actions/runner/issues/3792#issuecomment-3...
That bug is incredibly dumb and obvious. There's been a PR to fix it for over a year with no attention.
I bet there's not a dedicated "github domain names" team, it's probably part of some overworked platform or infrastructure team, and there's no chance in hell any email you send to microsoft or github will end up with that team ever.
You won't have anyone to transfer the names to, you'll just be holding them and paying for them forever.
The best thing you can do if you want to fix this is:
1. Don't make typos.
2. Email github and tell them to reserve typosquat domains, and know it will get ignored, or _maybe_ added to a backlog and ignored for at least the next 15 years
3. Don't make typos.
4. Don't use ghcr for anything, and always mirror public ghcr.io packages using a "bot" github account with only permissions to public repositories to minimize blast radius.
Actually, the best bet to get this fixed is to wait for Microsoft to provide "Email Github Copilot support", hope that they hooked it up so the AI is capable of making purchase decisions, and convince it to purchase about 6000 domain names that might be typos for security reasons.
Apparently fixed five days ago: https://github.com/actions/runner/pull/3157
But yes a joke of a situation.
> Don't use ghcr for anything
What is the alternative for small budget private code projects?
Assuming you're not distributing container images to a huge number of people, you can just run your own docker registry with a hard-to-typo name. It costs hardly anything to do: https://github.com/cloudflare/serverless-registry
Yeah I've been thinking about doing this and I probably will. I just have a tendency to scope creep my own projects and I just decided that maybe I should just use ghcr since it's free.
Arguably, the best thing to do to "fix" the issue is to be an evil hacker, and do bad things with it, causing damage, stealing people's money, causing Microsoft to be liable, which causes them to get sued, so then they're monetarily incentivized to actually fix the problem. Just, uh, donate the money that was stolen to a charity and not be evil about it.
Someone already is "being an evil hacker" i.e. running ghrc.io
Is microsoft liable for people typoing a "docker login" command? Is there any chance of a lawsuit?
The fact that there is already someone exploiting it, and it's a big "meh", kinda proves the point perfectly that it's not really a big enough deal for the world to fall into chaos.
Fairly compelling attack vector because it took several readings for me to even see the problem with the domain.
You and many others. Including people who retry multiple times, and even reboot their machines.
* https://stackoverflow.com/a/66985424/340790 (Spot the answerer's account name!)
* https://forums.docker.com/t/docker-unable-to-push-to-ghrc-io...
https://github.com/search?q=ghrc.io&type=code
Didn't think so many projects would have this kind of mistake!
Yikes!
Thank you for this.
Took the article pointing out that the c and r were transposed for me to even notice there was a problem!
Yep this is the sort of typo error I make probably 10 times a day.
What's funny is that, because of tokenization, there is a non-zero chance an LLM audit may not see anything wrong here, similar to the strawberry problem.
Nah, cr and rc are different tokens and LLMs would have no issues telling them apart. An older model might have trouble explaining that cr and rc are similar and can thus get easily mixed up, but the characters are probably more different to the LLM than they are to us.
The problem here is GitHub's terrible domain name.
The container registry has a horrible name.
Why does it seem companies hate subdomains so much? Why is this not just registry.github.com or something? It's like they are trying to get people to fall for phishing by creating so many random domains.
It’s best security practice to host user-generated content on a separate domain to opt into browsers’ cross-domain security policies. Hence ghcr.io, githubusercontent.com, fbimg.com, etc.
https://www.reddit.com/r/webdev/comments/lg9xnm/why_do_some_...
Not a web programmer, so I know cross-domain only from hearsay :(
It does not seem to hinder e.g. Google using google.com, youtube.com, gmail.com, and several (many?) others to collect your data. Are you saying security and privacy work differently here?
In those cases, the company controls all of the code running on those sites, so it's desirable for them to share data and cookies in particular. (e.g. any google.com site can read your login cookie)
In the case of user data domains, intentionally in the design of the service or via a security hole, users may be able to execute code and read cookies (e.g. in JavaScript on a page hosted on githubusercontent.com) and that's undesirable.
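A rough sketch of why the separate domain helps, in Python. It follows the basic domain-matching idea from the cookie spec (RFC 6265): a cookie scoped to a domain is sent to that host and its subdomains, and to nothing else. The function name is mine, and real browsers apply extra rules on top (the Public Suffix List, for instance), so treat this as an illustration rather than what a browser actually runs.

```python
import ipaddress


def _is_ip(host: str) -> bool:
    """True if host is a literal IP address rather than a host name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def domain_match(host: str, cookie_domain: str) -> bool:
    """Simplified RFC 6265 domain-match: exact host, or a subdomain of cookie_domain."""
    host = host.lower().rstrip(".")
    cookie_domain = cookie_domain.lower().lstrip(".")
    if host == cookie_domain:
        return True
    # The suffix must start at a label boundary, and IP addresses never suffix-match.
    return host.endswith("." + cookie_domain) and not _is_ip(host)


print(domain_match("mail.google.com", "google.com"))            # same-company subdomain
print(domain_match("raw.githubusercontent.com", "github.com"))  # user-content domain
```

A cookie scoped to github.com matches any *.github.com host, so user-controlled pages on a subdomain would be in scope for it; content served from githubusercontent.com is not.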
Sure, I see why as a company you don't want user data in your domain.
But if the different domain name gives good protection / isolation, why does Google still use completely different domains for different services with content controlled by them? I cannot believe they are interested in protecting users from data collection.
YouTube was an acquisition that they didn’t rebrand. Google Video was on google.com. gmail.com redirects to mail.google.com, and only email addresses use the gmail domain to avoid appearing to be google employee emails.
Interestingly, the GitHub doco says outright that it superseded docker.pkg.github.com, so it was a conscious choice to go with this domain naming scheme instead of that one.
* https://docs.github.com/en/packages/working-with-a-github-pa...
I've noticed this too. Why does amazon have aboutamazon.com and Google have developers.googleblog.com? They literally have their own .google TLD but still choose this weird domain.
Same with local governments. They love something really random like <countyname>proptaxpayment.org instead of treasurer.<countyname>.gov. It's exactly the kind of domain you are told to watch out for, but actually legit.
A common scenario I've seen in the case of local governments is that a department (e.g. the Assessing Department) contracts with a vendor to run the website and has no idea how DNS works, and the vendor defaults to registering new domains for their clients since that's the easiest when dealing with non-technical clients. Texas alone for example has 254 countries, the vast majority of which are very small and have effectively no full time IT department, so when these vendors are engaging new clients, low IT expertise is the norm by volume.
The local government itself may have an IT department, but they may not know how to create a subdomain, or even be aware this contract is being made and the site is being set up until after it's announced to the public.
Now you too are hearing a voice in your head, as I did, in the classic drawl, saying "Counties, kid. Texas ain't that big.". (-:
If you are very old[tm] you might remember that github pages were hosted on USER.github.com and they moved to USER.github.io in 2013, https://github.blog/news-insights/product-news/new-github-pa...
JFTR, I also think they could at least have used a couple of pronounceable domains, or put stuff under a .github.io domain, or at least made it githubrepo.com or something not acronym-y.
Probably it's seen as cool, and rewarded inside an org, to operate a service on its own domain rather than go ask another team for permission to use a subdomain.
insecurity through obscurity
One reason why you should never think or say ghcr, but always github container registry, even if that is longer. You should have enough time to avoid getting trapped.
The root cause is a stupid FLA, of course. For several months I thought it meant Google whatever registry.
One reason why you should never think or say [or write] FLA, but always Four Letter Acronym (probably?), even if that is longer.
I couldn't find anything useful - what is a FLA?
FLA is an unusual way of writing XTLA (Extended Three Letter Acronym).
Four Letter Acronym probably. https://slang.net/meaning/fla
Previously on Hacker News at https://news.ycombinator.com/item?id=44974240 .
whois says it's registered by dynadot, so it's probably worth contacting their abuse email: abuse@dynadot.com
There are a lot of open source projects using this domain https://github.com/search?q=ghrc.io&type=code
GitHub should have an internal tool to create fixes in bulk and send them out.
https://github.com/advanced-security/secret-scanning-custom-...
They probably do; they already have one that identifies credentials posted to github repos by accident.
That's a fairly impressively sized list.
Is the danger here token replay? It's using Bearer tokens, so it's not sending a password over:
<https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Aut...>
Threats section for Bearer tokens: <https://datatracker.ietf.org/doc/html/rfc6750#section-5.2>
Does OAuth reuse tokens across domains? If not, doesn't this just mean it is requesting an auth token for ghrc (the "fake" domain) but it can't access any auth tokens for ghcr (the real domain)?
Blog author (and OCI maintainer) here. The request to get a bearer token sends the password or PAT using the basic auth header, base64 encoded, but otherwise clear-text. That's the request the www-authenticate header is triggering. Once the token is received, the registry uses that to verify access, and that eventually expires. But the attacker isn't getting the token, they are requesting the credentials that would be used to acquire a bearer auth token.
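To make the "base64 encoded, but otherwise clear-text" point concrete, here is a minimal Python sketch. The account name and token are invented placeholders and the helper names are mine; it only shows that whoever receives a Basic auth header can trivially recover the credentials from it.

```python
import base64


def basic_auth_header(username: str, secret: str) -> str:
    """Build the Authorization header value a client sends for HTTP Basic auth."""
    raw = f"{username}:{secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value: str) -> tuple[str, str]:
    """Recover (username, secret) from a Basic Authorization header value."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic auth header")
    username, _, secret = base64.b64decode(encoded).decode("utf-8").partition(":")
    return username, secret


# Placeholder credentials, not a real account or token.
header = basic_auth_header("octocat", "ghp_exampleNotARealToken")
print(header)
print(decode_basic_auth(header))
```

Base64 is an encoding, not encryption: no key is involved, so the server on the typo'd domain sees exactly the password or PAT the client meant for the real registry.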
I don't get it. What is ghrc and why does it matter?
Damn, this can pick a typo from a CI job and do mean things.
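A cheap guard for CI is to grep your own configs for near-miss hostnames before anything runs. Here is a minimal sketch in Python, assuming you only care about adjacent-letter swaps of ghcr.io (the ghrc.io case); the function names and the sample input are hypothetical, and it won't catch other typo classes such as dropped or doubled letters.

```python
import re

REAL_HOST = "ghcr.io"


def transpositions(label: str) -> set[str]:
    """All strings made by swapping one pair of adjacent characters in label."""
    out = set()
    for i in range(len(label) - 1):
        swapped = label[:i] + label[i + 1] + label[i] + label[i + 2:]
        if swapped != label:
            out.add(swapped)
    return out


def find_typo_hosts(text: str, real_host: str = REAL_HOST) -> list[tuple[int, str]]:
    """Return (line number, matched host) for each transposed variant of real_host."""
    label, _, tld = real_host.partition(".")
    candidates = sorted(transpositions(label))
    pattern = re.compile(
        r"\b(?:" + "|".join(map(re.escape, candidates)) + r")\." + re.escape(tld) + r"\b"
    )
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Hypothetical config: line 1 has the transposed host, line 2 is correct.
sample = "image: ghrc.io/org/app:1.0\nimage: ghcr.io/org/app:1.0"
print(find_typo_hosts(sample))
```

Failing the build on any hit means the typo gets caught before a docker login ever sends credentials to the wrong host.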
Reminder not to use goofy TLDs, being cute is not worth it when compared to security. There's no guarantees that the process for taking down a malicious domain will be as smooth as a .com.
I'd rather deal with US Verisign than the British Indian Ocean Territory or Colombia or Anguilla.
The .io TLD is administered by Afilias which is an American corporation.
Afilias was sold to Ethos Capital and the whole domain is a mess:
https://en.m.wikipedia.org/wiki/.io
Wouldn't DNSSEC solve stuff like this?
How?