Text case changes the size of QR codes

(johndcook.com)

174 points | by ibobev 8 days ago

54 comments

  • MereInterest 2 days ago

    This was something that I paid close attention to when designing a QR code to be hand-carved into a set of coasters. To minimize the amount of detail carving required, I wanted to use the smallest QR code at 21x21 (version 1) tiles.

    With ASCII encoding, this would limit me to 17 characters, but the alphanumeric encoding allowed up to 25 characters. Since DNS is case-insensitive, this let me carve a slightly longer URL. The only downside was that it required making a custom redirect on my own website, since I couldn’t find any URL shorteners that would use all caps.
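
    For anyone who wants to check the 17 versus 25 figures, here is a minimal sketch of the bit budget. It assumes error correction level L (19 data codewords for version 1) and the header sizes used by versions 1 to 9; the function names are made up for illustration.

```python
# Rough bit-budget check for a version 1 (21x21) QR code.
# Assumptions: error correction level L, which leaves 19 data codewords
# (152 bits), and the header sizes that apply to versions 1 through 9.
V1_L_DATA_BITS = 19 * 8


def byte_mode_bits(n_chars: int) -> int:
    """4-bit mode indicator + 8-bit character count + 8 bits per character."""
    return 4 + 8 + 8 * n_chars


def alphanumeric_mode_bits(n_chars: int) -> int:
    """4-bit mode indicator + 9-bit count + 11 bits per pair, 6 for a leftover."""
    return 4 + 9 + 11 * (n_chars // 2) + 6 * (n_chars % 2)


def max_chars(bits_for, budget: int) -> int:
    """Largest character count whose encoding still fits in the bit budget."""
    n = 0
    while bits_for(n + 1) <= budget:
        n += 1
    return n


print(max_chars(byte_mode_bits, V1_L_DATA_BITS))          # 17
print(max_chars(alphanumeric_mode_bits, V1_L_DATA_BITS))  # 25
```

    Higher error correction levels leave fewer data codewords, so the limits shrink accordingly.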

    To this day, it is the most effort I’ve put into rick-rolling somebody.

    • jagged-chisel 2 days ago

      The cool bit is, due to the redirect, you can change the final destination without any more carving.

      • gregsadetsky 2 days ago

        Redirects going anywhere is super flexible, but is also the unfortunate business model of so many "free qr code generator" sites which end up taking your destination link hostage...! (this just reminds me of that, obv the parent post isn't doing that)

        My friend's partner once printed a qr code like that and then had to pay a monthly fee to keep the qr code working. Pure predator behavior.

        • joshstrange a day ago

          My sister has run into this before as well. You have to be very careful because 99% of QR generators out there do something like that. I’ve found some that don’t for her to use but I really should just vibe code up a website for her that I know won’t do a bait and switch (obviously they can’t change old “pure” QRs but they could start doing redirects on new ones at any time).

    • fragmede 2 days ago

      You're better than me, but you should try ssh funky.nondeterministic.computer sometime.

  • chrismorgan 2 days ago

    > Alphanumeric data, in the context of QR codes, comes from the following alphabet of 44 characters:

    > 0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-.:

    This is wrong: alphanumeric has 45 characters, not 44. It’s missing the second last character, /.

    (The slash is important because it makes alphanumeric-mode URLs possible: you can write HTTPS://EXAMPLE.COM/PATH which will be parsed to https://example.com/PATH. No query string or fragment due to no ?&=#, and your server must accept the uppercase path, either serving it or redirecting to the lowercase and then serving that.)
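
    A quick way to see which characters push a URL out of alphanumeric mode is to test it against the 45-character alphabet. This is only a sketch; `blocking_chars` is a hypothetical helper name.

```python
# The 45-character QR alphanumeric alphabet, including the slash.
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def blocking_chars(text: str) -> list[str]:
    """Return the distinct characters that rule out alphanumeric mode, in order."""
    seen = []
    for ch in text:
        if ch not in ALPHANUMERIC and ch not in seen:
            seen.append(ch)
    return seen


print(blocking_chars("HTTPS://EXAMPLE.COM/PATH"))        # []
print(blocking_chars("HTTPS://EXAMPLE.COM/PATH?A=1#B"))  # ['?', '=', '#']
```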

    An alphabet size of 45 is the largest that will fit into 5½ bits per character (log₂ 45 ≈ 5.49).
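
    The 5½ bits come from packing two characters into one 11-bit number. A minimal sketch of that packing, covering the data bits only (no mode indicator, character count, padding, or error correction); the function name is illustrative:

```python
# Index in this string is the character's value (0 to 44).
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text: str) -> str:
    """Return the data bits for the text as a string of 0s and 1s."""
    values = [ALPHABET.index(ch) for ch in text]  # ValueError if unsupported
    bits = []
    for i in range(0, len(values) - 1, 2):
        # Two characters become one number below 45 * 45 = 2025,
        # which fits in 11 bits (2 ** 11 = 2048).
        bits.append(format(values[i] * 45 + values[i + 1], "011b"))
    if len(values) % 2:
        # A leftover single character takes 6 bits.
        bits.append(format(values[-1], "06b"))
    return "".join(bits)


# 45 is the largest alphabet whose pairs fit in 11 bits.
assert 45 * 45 <= 2 ** 11 < 46 * 46

print(len(encode_alphanumeric("HTTPS://EXAMPLE.COM/PATH")))  # 132 bits for 24 characters
```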

    • johndcook 2 days ago

      Thanks. Fixed.

      • drfuchs 2 days ago

        It still says "44 characters" when I click the link.

        • johndcook 2 days ago

          Fixed again. I missed one. :)

          Thanks.

    • chrisandchris 2 days ago

      > [...] must accept the uppercase path [...]

      FWIW, the path segment itself is case-sensitive and it comes down to the webserver (and then mostly the filesystem) whether it wants to treat the path case-sensitively or not. There's no guarantee in HTTP that /PATH will serve a path located at /path.

  • ericpauley 2 days ago

    A major frustration in my life is that LinkedIn QR codes will not support all caps. It’s not even a profile capitalization issue; the app will refuse to scan the code if the “/in/” is capitalized. The resulting size difference is quite noticeable, particularly in small format.

    • mcdonje 2 days ago

      I can't help but interpret that as a clue as to which internal groups hold power over there.

      • rafram 2 days ago

        Huh?

        • mcdonje 2 days ago

          Lowercase "in" is a major part of their branding. Forcing usage of lowercase "in" in this scenario supports the branding even if it doesn't make sense from an engineering standpoint.

          • jonathanlydall 2 days ago

            They can redirect from the upper to the lower case URL so that it still looks the way they want.

            It might not be intentional that it doesn’t work with uppercase, but they just made it lower case and by default it’s case sensitive on whatever software stack they use to host.
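
            As a rough illustration of that kind of redirect (not a claim about any real site's stack), here is a sketch using Python's standard library. The handler name, the placeholder response, and the choice of a 301 are all assumptions.

```python
from http.server import BaseHTTPRequestHandler
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def lowercase_redirect(target: str) -> Optional[str]:
    """Return the request target with a lowercased path, or None if unchanged."""
    parts = urlsplit(target)
    lowered = parts.path.lower()
    if lowered == parts.path:
        return None
    # Only the path is lowercased; a query string keeps its case.
    return urlunsplit(parts._replace(path=lowered))


class LowercaseRedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: redirect uppercase paths, serve lowercase ones."""

    def do_GET(self):
        location = lowercase_redirect(self.path)
        if location is not None:
            self.send_response(301)
            self.send_header("Location", location)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"placeholder page\n")


print(lowercase_redirect("/IN/SOMEONE"))  # /in/someone
print(lowercase_redirect("/in/someone"))  # None
```

            The handler could be served with `http.server.HTTPServer(("", 8000), LowercaseRedirectHandler).serve_forever()`; the address bar then shows the lowercase form after the redirect.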

          • fukka42 2 days ago

            That's why engineering should silently add a toLowerCase and just generate qr codes with lower case as well.

    • tln 2 days ago

      I would not link directly to LinkedIn. They have changed the optimal url many times.

    • jdndbxbcb 2 days ago

      According to the article, / is not in the basic alphanumeric alphabet anyway?

      • nemetroid 2 days ago

        The article has an off-by-one error. There are 45 characters in the basic alphanumeric alphabet, and / is the missing one.

  • jwr 2 days ago

    I invested quite a bit of effort into designing URLs for the PartsBox ID Anything™ system so that they fit well into optimized QR codes. Uppercasing was one of the tricks, and it makes quite a difference indeed. This is important when you want small and yet easily readable/scannable QR codes.

    Later it turned out that when printing labels, you hand your data off to the printer to produce the QR code anyway, and printers do not try very hard to find an optimal encoding.

  • future10se 2 days ago

    A similar article was posted earlier this year, discussion here: https://news.ycombinator.com/item?id=43149077

  • zygentoma 2 days ago

    Wouldn't it have been much more sensible to have a lowercase version of alphanumerics in the QR code standard? Almost all URLs are lowercase, and even if they have capitalised parts, in most cases they're case-insensitive.

    • orangewindies 2 days ago

      QR codes were created for labelling automotive parts, not for URLs. Part numbers are usually uppercase alphanumeric, with a few punctuation characters.

    • mmulqueen 2 days ago

      QR codes existed for over a decade before smartphones brought them into the mainstream. They're high density replacements for barcodes, which are uppercase by convention (or in some cases like Code 39, only support uppercase). URLs in QR codes are a later innovation.

    • Etheryte 2 days ago

      Urls can include parameters and for those capitalization can definitely matter.

    • vbezhenar 2 days ago

      The domain name and the protocol name are case-insensitive.

      If you care about QR code size, you should use URL shortener service which you can program in whatever way you need.

      So I don't think that's a major restriction.

      • Xss3 2 days ago

        Then your qr code only works if the url shortener stays online.

        In many cases it's wiser to roll your own permanent shortened URLs so you aren't beholden to a third-party service.

        • vbezhenar 2 days ago

          Of course you can roll out your own URL shortener. It's not a problem to buy a short domain.

  • OhMeadhbh 2 days ago

    In the 2010s I had a startup generating a bunch of QR Codes. We generated URLs with a BASE32 variant plus upper case domain and scheme so we could stay within the 5.5 bits per character encoding. I kept trying to explain this feature to people, but it was an uphill battle. To this day I still use upper case w/ URLs JUST-IN-CASE I want to put them in a QR Code.

    One minor annoyance I have is the ``git`` command line tools (at least the ones distributed w/ Debian) are case sensitive w/ URLs, so if you try the command:

       git clone HTTPS://GIT.BI6.US/TQT
    
    you'll get an error, but

       git clone https://GIT.BI6.US/TQT
    
    does what you might expect it to do. This is a very minor nit since no git tool I know directly consumes QR codes and if you made one, you could lower case the protocol section yourself. It's just that all day I'm using URLs that are intentionally upper-cased and the one time I need to lower-case a portion of it, I always forget.

    I should probably publish the BASE32 variant we used. Mostly just removed the I's O's and a few other letters that could easily be confused with digits or with other letters.
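
    Until that variant is published, here is a sketch of the general idea using Crockford's Base32 alphabet as a stand-in. That alphabet drops I, L, O and U; the variant described above may differ, and `encode_id` / `decode_id` are hypothetical names.

```python
# Crockford's Base32 alphabet: digits plus uppercase letters without I, L, O, U.
# This is a stand-in; the variant described in the comment above may differ.
CROCKFORD32 = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_id(n: int) -> str:
    """Encode a non-negative integer as an uppercase Base32 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while True:
        n, rem = divmod(n, 32)
        digits.append(CROCKFORD32[rem])
        if n == 0:
            break
    return "".join(reversed(digits))


def decode_id(text: str) -> int:
    """Decode a string produced by encode_id; lowercase input is accepted."""
    n = 0
    for ch in text.upper():
        n = n * 32 + CROCKFORD32.index(ch)  # ValueError on I, L, O, U
    return n


token = encode_id(1234567)
print(token, decode_id(token) == 1234567)
```

    Every character of the output is in the QR alphanumeric set, so each 5-bit digit costs 5.5 bits in the code rather than 8.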

    • craftkiller 2 days ago

      Interesting, that means git is violating the URI spec. Quoting RFC 3986:

      > [...] schemes are case-insensitive [...] An implementation should accept uppercase letters as equivalent to lowercase in scheme names (e.g., allow "HTTP" as well as "http") [...]

      https://datatracker.ietf.org/doc/html/rfc3986#section-3.1
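
      For a client that wants to accept such URLs, normalizing the scheme and host before use is cheap. A minimal sketch with Python's `urllib.parse`, assuming the URL carries no userinfo (a `user:password@` part would be lowercased too); the function name is made up:

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_scheme_and_host(url: str) -> str:
    """Lowercase the scheme and host, leaving path, query, and fragment untouched."""
    parts = urlsplit(url)  # urlsplit already returns the scheme in lowercase
    # Assumption: no userinfo in the netloc, so lowercasing it is safe.
    return urlunsplit(parts._replace(netloc=parts.netloc.lower()))


print(normalize_scheme_and_host("HTTPS://GIT.BI6.US/TQT"))  # https://git.bi6.us/TQT
```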

    • lifthrasiir 2 days ago

      `https://GIT.BI6.US/TGT` is actually not that bad because the optimal encoding will use the 8-bit encoding only for the `https` part and you only need 13 more alphanumeric characters to beat the full 8-bit encoding (if my calculation is correct).

      • fph 2 days ago

        Can you mix encodings in a QR code?

  • slig 2 days ago

    This tool [1] let me figure that out a couple of years ago. And on the printable pages of the sites I own, for instance [2], I use an all-caps domain and identifiers so that all my QR codes are tiny (e.g. `HTTPS://ABC.DE/ABCD/42`, with up to 10 chars in the path).

    [1]: https://www.nayuki.io/page/creating-a-qr-code-step-by-step
    [2]: https://www.brainzilla.com/logic/zebra/pdf/blood-donation.pd...

  • gregsadetsky 2 days ago

    Similar article on this exact topic, with a bit more detail on the QR encoding side, from Terence Eden:

    https://shkspr.mobi/blog/2025/02/why-are-qr-codes-with-capit...

    Past discussion: https://news.ycombinator.com/item?id=43149077

    (self promo-ish) The above blog article was one of the pieces shown at the QR Show that I helped co-organize earlier this year at the Recurse Center. I'm definitely tempted to have another one in 2026!

    The whole list of pieces is here: https://qrshow.nyc/retrospective.html

  • Theodores 2 days ago

    I am amazed at how much there is to know about QR codes, particularly if you want them to look pretty.

    I want the super succinct QR code and I believe that to be optimal. However, I keep seeing massively complicated QR codes, as if going from 8 bit to 64 bit, and I assume these work well. Given the amount of megapixels in any camera made this century and the prevalence of over complicated URLs in QR form, I am not sure if minimised QR codes have any benefit whatsoever. By minimised, I mean 29 x 29.

    On the QR topic, I don't understand how logos in the middle work. You are losing pixels and checks with the logo in the middle which is fine until you make the logo too big.

    Also related, imagine you wanted a HN QR code with 'Hacker News' written in the middle. This would work as a box in the middle but would be hard to read. So you can make a line across the middle rather than a box in the middle. This will break the QR code but not if you rotate the QR code 90 degrees first.

    Maybe my best option to fully understand the quirks is to start with the QR spec and then to make my own QR codes.

    • dgl 2 days ago

      > On the QR topic, I don't understand how logos in the middle work. You are losing pixels and checks with the logo in the middle which is fine until you make the logo too big.

      It is possible to add logos without (well, differently) abusing the error correction: https://research.swtch.com/qart

      Of course most images in the middle aren’t doing that and rely on some level of error correction fixing it.

      • sgarland 2 days ago

        I love how dedicated some people are to hacking random things for fun. What a great read!

    • crazygringo a day ago

      > Given the amount of megapixels in any camera

      It's not just that. It's the resolution of the display device too. Its brightness and whether that is causing bloom. If it's printed small, users may instinctively put the camera too close and it won't be able to focus. People have shaky hands and the image will have motion blur. Glare is a persistent problem with bikeshare apps, where the app turns on your phone's flashlight while scanning for nighttime use, but the QR code is glossy. Codes get scratched up, and the smaller the blocks are, the more they are degraded.

      There really are tons of ways QR codes degrade no matter how many megapixels you have, and smaller codes are always going to be more resistant given the same overall physical size.

    • nomel 2 days ago

      > particularly if you want them to look pretty.

      I've been amazed at some of the QR codes I've seen in TV ads. Multiple subdirectories, plethora of parameters containing full city names and redundant zip codes, etc, resulting in massive QR codes that you can't even use from your couch because they're so dense.

    • wongarsu 2 days ago

      > Also related, imagine you wanted a HN QR code with 'Hacker News' written in the middle. This would work as a box in the middle but would be hard to read. So you can make a line across the middle rather than a box in the middle. This will break the QR code but not if you rotate the QR code 90 degrees first.

      The outside (the parts between the alignment marks) has metadata, separated from the data by a dotted line. If your line touches the metadata, that's bad. But as long as you stay within the data block, any shape should work as long as you are not modifying more pixels than the chosen level of error correction can handle.

    • fisian 2 days ago

      Another fun thing is these Stable Diffusion/ControlNet combinations, which create QR codes that are at the same time AI-generated art, e.g. qrdiffusion or qrbtf.

  • lifthrasiir 2 days ago

    You don't need to make the whole payload uppercase to benefit from this aspect of QR code, as a single QR code can use multiple different encoding schemes. It is sufficient that a large consecutive portion of your payload is limited to those alphanumeric characters.
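
    To put rough numbers on that, here is a sketch comparing one byte-mode segment against a byte-mode prefix followed by an alphanumeric segment. It assumes the version 1 to 9 header sizes and ASCII text, counts data bits only, and uses made-up function names; a real encoder would search for the best split itself.

```python
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def byte_segment_bits(n: int) -> int:
    """4-bit mode indicator + 8-bit count + 8 bits per character."""
    return 4 + 8 + 8 * n


def alphanumeric_segment_bits(n: int) -> int:
    """4-bit mode indicator + 9-bit count + 11 bits per pair, 6 for a leftover."""
    return 4 + 9 + 11 * (n // 2) + 6 * (n % 2)


def split_bits(text: str, prefix_len: int) -> int:
    """Bits for a byte-mode prefix followed by an alphanumeric-mode remainder."""
    rest = text[prefix_len:]
    if not set(rest) <= ALPHANUMERIC:
        raise ValueError("remainder is not alphanumeric-mode compatible")
    return byte_segment_bits(prefix_len) + alphanumeric_segment_bits(len(rest))


url = "https://GIT.BI6.US/TQT"
print(byte_segment_bits(len(url)))          # 188: a single byte-mode segment
print(split_bits(url, len("https")))        # 159: "https" as bytes, the rest alphanumeric
print(alphanumeric_segment_bits(len(url)))  # 134: if the scheme were uppercase too
```

    Each extra segment costs its own mode indicator and character count, which is why the alphanumeric run has to be reasonably long before the split pays off.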

  • theandrewbailey 2 days ago

    Just this week, I was writing a script to generate a UUID, make a QR code of it, and print both on a label. I uppercased the UUID to hopefully make the label more human readable after scratching and scuffing, and noticed the QR code got smaller.
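
    A sketch of why that happens for a 36-character UUID string. It assumes error correction level L, a single segment, and ignores numeric mode; the UUID below is just a sample value and the function names are illustrative.

```python
# Data codewords at error correction level L for versions 1-4 (assumed level).
DATA_CODEWORDS_L = {1: 19, 2: 34, 3: 55, 4: 80}
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def payload_bits(text: str) -> int:
    """Bits for one segment, using alphanumeric mode when the text allows it."""
    n = len(text)
    if set(text) <= ALPHANUMERIC:
        return 4 + 9 + 11 * (n // 2) + 6 * (n % 2)
    return 4 + 8 + 8 * len(text.encode("utf-8"))


def smallest_version(text: str) -> int:
    """Smallest version (1-4) whose level L data capacity holds the payload."""
    bits = payload_bits(text)
    for version, codewords in DATA_CODEWORDS_L.items():
        if bits <= codewords * 8:
            return version
    raise ValueError("needs a version above 4")


label = "123e4567-e89b-12d3-a456-426614174000"  # sample UUID
for text in (label, label.upper()):
    version = smallest_version(text)
    side = 17 + 4 * version  # modules per side
    print(f"{text}: version {version}, {side}x{side}")
# lowercase: version 3, 29x29
# uppercase: version 2, 25x25
```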

    Now I know why.

  • FinnKuhn 2 days ago

    This also seems to work for URLs, so the next time I create a QR code I will keep this in mind. Really useful!

    I tested this with the following example: https://imgur.com/a/hTsvV3Z

    • layer8 2 days ago

      That example URL doesn’t work when you convert it to uppercase. ;)

      • FinnKuhn 2 days ago

        It definitely does when I scan each barcode with my iPhone.

        • smallerize 2 days ago

          The imgur link is case-sensitive though lol.

  • mrasong 2 days ago

    First time learning about this detail.

    • degrees57 2 days ago

      Same. I am thankful that this sort of article shows up here.

  • efskap 2 days ago

    It also changes the number of tokens your LLM works with, e.g. in title case it might treat the capitals as their own tokens.