Tailscale state file encryption no longer enabled by default

(tailscale.com)

354 points | by traceroute66 a day ago

134 comments

cronos a day ago
I'm one of the Tailscale engineers who built node state encryption initially (@awly on GitHub), and who made the call to turn it off by default in 1.92.5.
Another comment in this thread guessed right - this feature is too support intensive. Our original thinking was that a TPM being reset or replaced is always a sign of tampering and should result in the client refusing to start or connect. But it turns out there are many situations where TPMs are not reliable for non-malicious reasons. Some examples:
* https://github.com/tailscale/tailscale/issues/17654
* https://github.com/tailscale/tailscale/issues/18288
* https://github.com/tailscale/tailscale/issues/18302
* plus a number of support tickets
TPMs are a great tool for organizations that have good control of their devices. But the very heterogeneous fleet of devices that Tailscale users have is very difficult to support out of the box. So for now we leave it to security-conscious users and admins to enable, while avoiding unexpected breakage for the broader user base.
We should've provided more of this context in the changelog, apologies!
- snailmailman a day ago
  Those issues are a surprising read. I would expect issues with TPM on old or niche devices, but not Dell XPS laptops, or a variety of VMs. But I guess I'm not entirely sure how my vms handle TPM state, or if they even can.
  I'm running nearly all of my personal tailscale instances in containers and VMs. Looking now at the dashboard, it appears this feature really only encrypted things on my primary linux and windows pc, my iphone, and my main linux server's host. None of the VMs+containers i use were able to take advantage of this, nor was my laptop. Although my laptop might be too old.
  - 9x39 a day ago
    Stuff breaks all the time, you just need a bigger sample size.
    Overseeing IT admins for corp fleets is part of my gig, and from my experience, we get malfunctioning TPMs on anything consumer - Lenovo, Dell, HP, whatever. I think the incidence is some fraction of a percent, but get a few thousand devices and the chance of eventually experiencing it is high, very high. I can't imagine a vTPM being perfect either, since there isn't a hypervisor out there someone hasn't screwed up a VM on.
    - c0nsumer 20 hours ago
      Many, many more devices here... And good/typical enterprise level hardware... And failing TPMs are just something that happens. It's pretty expected these days. And on Windows when it causes a loss of certificates, it's actually a good bit more of a pain than just a dying disk or display or something, because it's not immediately obvious what's wrong, it just doesn't talk to the network properly anymore, or so.
      I'm not surprised by Tailscale's change here. It's a good move.
    - yndoendo a day ago
      The issue could be a bug in the host OS not in the VM. I had a Windows update that broke VMs when the guest OS was Windows running in real-time mode. This was the only issue and if I didn't run real-time VMs I would have never known. The only resolution was to reinstall Windows.
  - slyn a day ago
    Just had a system board replaced on a device in my org, Dell laptop.
    As part of setting up a device in our org we enroll our device in Intune (Microsoft's cloud-based device management tool aka UEM / RMM / MDM / etc). To enroll your device you take a "hardware hash", which is basically TPM attestation and some additional spices, and upload it to their admin portal.
    After the system board replacement we got errors that the device is in another org's tenant. This is not unusual (you open a ticket with MS and they typically fix it for you), and really isn't to blame on Dell per se. Why ewaste equipment you can refurbish?
    Just adding 5c to the anecdata out there re: TPM as an imperfect solution.
    - Fnoord a day ago
      When I replaced a motherboard (rest of the hw was OK) Microsoft was of the opinion I had a 'new computer' and would need to buy a new Windows 10 license (of IIRC 150 EUR → scoundrels). I went to G2A and bought one for 20 EUR. Then it hit me. This occurred before when my previous motherboard/CPU was broken, and back then I actually called Microsoft where they insisted on selling me a new license. I did exactly the same back then.
      - literalAardvark 10 hours ago
        I've handled technical+legal concerns for licensing for a very small org in a different lifetime, and yes, that's exactly how Microsoft used to think of licenses. I don't know how it works these days, it's someone else's problem.
        We had to archive invoices+servicing documentation for warrantied mobos from the supplier to keep a legal licensing chain.
        Fnoord 8 hours ago
        I remember the path my license had: it was a free upgrade to Windows 10, from Windows 7 (right before they removed said free upgrade; I tend to be slow with adopting Windows versions). The original Windows 7 license was a pirated one, but that didn't matter (we know why: before GDPR, Microsoft could spy on Windows 10 users, and the pirated Windows 7 was already a lost sale).
        Apparently the free upgrade was OEM, bound to the hardware. I did not know. Either way, I'm from Europe (EU), and here a software license cannot be exhausted via second hand market, so it stands to reason I can buy one second hand. That this isn't what Microsoft support is told to discuss, suuure (even when I explicitly asked for it, they insisted I had to buy it via them).
      - techcode 18 hours ago
        I've had quite the opposite experience with Microsoft.
        One time their support just gave me a licence for a newer version of Windows - I had replaced the HDD/SSD, cloned/copied it, and it was not activated. I contacted their chat support from that laptop and when they asked me for the licence on the sticker I mentioned I'd have to come back in 5 minutes since I'd have to turn off the laptop and take out the battery to see the MS sticker/hologram.
        Support said "No worries, here's a new activation key".
        Can't recall if it was from XP to Win 7, or Win 7 to 10.
        --
        And after buying 2 or 3 licences from another website just like G2A (Win 10 was ~€10 on Instant-Gaming) - a bunch of new computers (even brand new assembled desktops) were automatically activated.
  - evanjrowley a day ago
    My eyes have opened up to the pitfalls of TPM recently while upgrading CPUs and BIOS/UEFI versions on various hardware in my home.
    VMs typically do not use TPMs, so it is not surprising that the feature was not being used there. One common exception is VMware, which can provide the host's TPM to the VM for a better Windows 11 experience. One caveat is this doesn't work on most Ryzen systems because they implement a CPU-based fTPM that VMware does not accept.
    - bdavbdav a day ago
      AIUI most hypervisors offer vTPM - it’s often disabled by default, but most solutions have it (including Proxmox / KVM, using swtpm).
    - asciii 21 hours ago
      I did not realize that the fTPM on CPU can also cause speed lags and stuttering because of the overhead of security stuff
  - zozbot234 a day ago
    It is in fact surprising that TPMs can be wiped so easily. It makes them almost useless compared to dedicated solutions like physical FIDO keys or smartcards, and does not bode well for hardware-backed Passkeys that would also be inherently reliant on TPM storage.
    - Fnoord a day ago
      Not all TPM. I've yet to manage it on my MBP M1 Pro or my Pixel. Of course, M1-M3 have broken secure enclave which cannot be fixed by the user.
      On AMD with fTPM I get a fat warning if I want to reset the fTPM keys. I think earlier implementations failed here.
      > and does not bode well for hardware-backed Passkeys that would also be inherently reliant on TPM storage.
      So you revoke the key and auth in another way (or you use a backup). One passkey is never meant to be the one sole way of auth.
      I actually like the concept. Consider a situation where you would log into your webmail while in a café or bus. If the password is tied to your hardware, nobody can watch over your shoulder to use it on theirs.
      I don't use them much (I've been forced to) because I already use a self-hosted password manager where I never see the password myself. But for the average person, passkeys are better.
      Now, if you compare with FIDO2, those are supposed to be with you all the time (something you have). So they can be used on multiple platforms, while a TPM is tied to hardware.
      - arthurcolle a day ago
        > Of course, M1-M3 have broken secure enclave which cannot be fixed by the user.
        haven't heard about this, link?
        Fnoord a day ago
        Called GoFetch, from (approx) Mar 21 2024 [1]. In 2022 there was a side channel attack called Augury on M1 / A14. The article refers to it.
        [1] https://arstechnica.com/security/2024/03/hackers-can-extract...
        hollerith 3 hours ago
        Not good, but that really doesn't sound like a vuln in the Secure Enclave but rather in the main CPU.
    - heavyset_go 19 hours ago
      You can DoS many physical FIDO tokens by using the wrong PIN on purpose several times.
      They're programmed to lock or reset as a security measure. If they're locked, they need a special process, software and credentials to unlock them, which you might not have immediate, or any, access to.
      If they reset, it's no different than wiping a TPM.
  - braiamp 9 hours ago
    As some kernel developers have said: motherboard manufacturers are really bad at making sure stuff works.
  - Macha a day ago
    I had a Ryzen 3900x on a gigabyte motherboard and the fTPM was just totally unreliable for a pretty mainstream combination. Not fully sure which was to blame there.
    At least it was fixed in the 5900x (and _different_ gigabyte motherboard, but from the same lineup) that replaced it.
    - Marsymars a day ago
      This jumped out to me because I had a TPM problem on an FM2 Gigabyte mobo in ~2015. (Back when a TPM on desktop mobos required a plug-in module.)
      It took me months of hassling Gigabyte to get them to issue me a beta BIOS that fixed the bug, and the fix never did make it to a non-beta BIOS.
  - justincormack 13 hours ago
    VMs don't have TPMs as they are hw devices, although you can run a software TPM (potentially backed by the host TPM) and pass it to them, which you might want to do for this use case.
    - wolvoleo 8 hours ago
      That would be nice, in that case you can also extract keys that apps store in there. Interesting, I'll try that out.
  - behringer a day ago
    I'm not sure what makes any of this "surprising". Each ticket reads like "we replaced the computer that tailscale was on, it doesn't work anymore" pikachu face.
    Yeah, that was a feature and the exact reason why we use TPMs. I guess it should have been better advertised.
- sydbarrett74 a day ago
  That's an eminently reasonable and logical policy. Thanks for the context.
- traceroute66 a day ago
  @cronos
  Question:
  You link to https://github.com/tailscale/tailscale/issues/17654 where a user states[1]:
  "Previous workaround from some comments (TS_ENCRYPT_STATE=false, FLAGS="--encrypt-state=false") didn't help on this problematic Debian 13 host"
  And the same user states "I confirm this issue is NOT found anymore with tailscale version 1.92.1".
  Could you provide a little extra context to clarify those types of comments which seem to suggest it wasn't state encryption after all ?
  [1] https://github.com/tailscale/tailscale/issues/17654#issuecom...
  - cronos a day ago
    There are two new-ish features in Tailscale that use TPMs: node state encryption (https://tailscale.com/kb/1596/secure-node-state-storage) and hardware attestation keys.
    Hardware key attestation is a yet-unfinished feature that we're building. The idea is to generate a signing key inside of the TPM and use it to send signatures to our control plane and other nodes, proving that it's the same node still. (The difference from node state encryption is that an attacker can still steal the node credentials from memory while they are decrypted at runtime).
    We started by always generating hardware attestation keys on first start or loading them from the TPM if they were already generated (which seemed safe enough to do by default). That loading part was causing startup failures in some cases.
    To be honest, I didn't get to the bottom of all the reports in that github issue, but this is likely why for some users setting `--encrypt-state=false` didn't help.
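The changelog behavior discussed in this thread ("failure to load hardware attestation keys no longer prevents the client from starting") is a fail-open pattern. A minimal sketch of that pattern, in Python rather than Tailscale's Go; `TPMUnavailable`, `start_client`, and `broken_tpm` are hypothetical names for illustration and are not taken from Tailscale's code:

```python
import logging
from typing import Callable, Optional

log = logging.getLogger("client")


class TPMUnavailable(Exception):
    """Hypothetical error: the TPM was reset, replaced, or cannot be read."""


def start_client(load_attestation_key: Callable[[], bytes]) -> dict:
    """Start the client even when the attestation key cannot be loaded."""
    key: Optional[bytes]
    try:
        key = load_attestation_key()
    except TPMUnavailable as err:
        # Fail open: log the problem and continue without attestation.
        log.warning("attestation key unavailable, continuing: %s", err)
        key = None
    return {"running": True, "attestation": key is not None}


def broken_tpm() -> bytes:
    """Stand-in loader that simulates a reset TPM."""
    raise TPMUnavailable("TPM was reset")
```

The trade-off is the one described above: a fail-closed client treats every TPM fault as tampering, while a fail-open client keeps working and simply loses the attestation signal.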
    - traceroute66 a day ago
      Also I assume "off by default" also affects macOS, iOS and Android users who don't rely on TPM at all ?
      - cronos a day ago
        Nope, only Windows/Linux where TPMs exist.
- dietr1ch a day ago
  I too thought that the TPM was something to be trusted with a secret until a BIOS upgrade just wiped mine. I'm not relying on TPM again.
  - johncolanduoni a day ago
    It was designed mostly for mechanisms where in the event of certain changes (BIOS upgrades, certain other firmware changes, some OS changes) there is a fallback mechanism to unlock the system and reset the key. This is why Windows BitLocker is so insistent about you saving your key somewhere else - if you do a BIOS update and it can’t decrypt, it’ll require your copy of the key and then reset the TPM-encrypted copy with the new BIOS accounted for.
    A TPM’s primary function works by hashing things during the boot process, and then telling the TPM to only allow a certain operation if hashes X & Z don’t change. Depending on how the OS/software uses it, a whole host of things that go into that hash can change: BIOS updates being a common one. A hostile BIOS update can compromise the boot process, so some systems will not permit automatic decryption of the boot drive (or similar things) until the user can confirm that they have the key.
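The hashing during boot described above works through an "extend" operation: a register can only be updated to the hash of its old value plus the new measurement. A minimal sketch of that idea, assuming a SHA-256 bank and made-up component names; it models the arithmetic only and is not a real TPM interface:

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 PCR


def extend(pcr: bytes, component: bytes) -> bytes:
    """TPM-style extend: new_pcr = H(old_pcr || H(component))."""
    digest = hashlib.sha256(component).digest()
    return hashlib.sha256(pcr + digest).digest()


def measure_boot(components: list[bytes]) -> bytes:
    """Fold a boot chain into one register value, starting from all zeros."""
    pcr = bytes(PCR_SIZE)
    for component in components:
        pcr = extend(pcr, component)
    return pcr


# Illustrative boot chains: only the BIOS version differs.
before_update = measure_boot([b"bios v1", b"bootloader", b"kernel"])
after_update = measure_boot([b"bios v2", b"bootloader", b"kernel"])
```

Because each step hashes over the previous value, changing any one component changes the final value, which is why a BIOS update can make a policy tied to those hashes stop matching.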
- nathanlied 18 hours ago
  Thank you for your openness here - and yes, it would be nice to see this kind of reasoning in the changelog, even if it's tucked a little out of the way! Those of us who care will read it.
  Also very welcome is to separate it into a small blogpost providing details, if the situation warrants a longer, more detailed format.
- AceJohnny2 a day ago
  Thanks! In your change https://github.com/tailscale/tailscale/pull/18336 you mention:
  > There's also tailscaled-on-macOS, but it won't have a TPM or Keychain bindings anyway.
  Do you mean that on macOS, tailscaled does not and has never leveraged equivalent hardware-attestation functionality from the SEP? (Assuming such functionality is available)
  - cronos a day ago
    On macOS we have 3 ways to run Tailscale: https://tailscale.com/kb/1065/macos-variants Two of them have a GUI component and use the Keychain to store their state.
    The third one is just the open-source tailscaled binary that you have to compile yourself, and it doesn't talk to the Keychain. It stores a plaintext file on disk like the Linux variant without state encryption. Unlike the GUI variants, this one is not a Swift program that can easily talk to the Keychain API.
    - cyberax 21 hours ago
      You don't need Swift to use the Keychain API. It's doable from pure C.
      - cronos 8 hours ago
        Good to know, my understanding of the macOS system APIs is fairly limited. I'm sure it's doable, with some elbow grease and CGO. We just haven't prioritized that variant of the client due to relatively low usage.
        cyberax 5 hours ago
        If you want to avoid Cgo, you can use https://github.com/ebitengine/purego or Goffi to call the native functions. It's a bit cursed, but works.
      - johncolanduoni 21 hours ago
        In fact, SecurityFramework doesn’t have a real Swift/Obj-C API. The relevant functions are all direct bindings to C ABIs (just with wrappers around the CoreFoundation types).
        lloeki 16 hours ago
        > The third one is just the open-source tailscaled binary that you have to compile yourself, and it doesn't talk to the Keychain.
        I use this one (via nix-darwin) because it has the nice property of starting as a systemwide daemon outside of any user context, which in turn means that it has no (user) keychain to access (there are some conundrums between accessing such keychains and "GUI" i.e user login being needed, irrespective of C vs Swift or whatever).
        Maybe it _could_ store things in the system keychain? But I'm not entirely sure what the gain would be when the intent is to have tailscale access through fully unattended reboots.
    - reader9274 19 hours ago
      Only one of the ways uses Keychain per that page.
      - cronos 8 hours ago
        Ah, looks like another KB update is needed, thanks for calling it out!
- pja a day ago
  A BIOS update to my PC reset the TPM only this week. I did get a warning that Bitlocker keys would be wiped as a result before acting at least.
  (I believe this was because it was fixing an AMD TPM exploit - presumably updating the TPM code wipes the TPM storage either deliberately or as an inevitable side effect.)
  - plagiarist a day ago
    TPMs are basically storing the hashes of various pieces of software, then deterministically generating a key from those. Since the BIOS software changed, that hash changed, and the key it generates is completely new.
    If someone had messed with your BIOS maliciously, that's desirable. Unfortunately you messing with your BIOS intentionally also makes the original key pretty much unrecoverable.
    - cronos a day ago
      IIUC, it's a bit more nuanced: TPM stores hashes of various things like firmware in PCRs, and when creating keys in the TPM you can optionally bind the key to specific PCR values. But you also don't have to (and Tailscale doesn't), in which case keys survive firmware updates for example.
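The distinction between a key bound to PCR values and one that is not can be shown with a toy model. `ToyTPM` and its methods are hypothetical and exist only for this sketch; a real TPM enforces the policy in hardware and never hands the stored material to software like this:

```python
import hashlib
import hmac


class ToyTPM:
    """Toy model of optional PCR binding; not a real TPM API."""

    def __init__(self, pcr: bytes):
        self.pcr = pcr      # current measurement, e.g. a firmware hash
        self._sealed = {}   # name -> (secret, expected PCR or None)

    def seal(self, name: str, secret: bytes, bind_pcr: bool = False) -> None:
        """Store a secret, optionally tied to the current PCR value."""
        expected = self.pcr if bind_pcr else None
        self._sealed[name] = (secret, expected)

    def unseal(self, name: str) -> bytes:
        """Release a secret unless it is bound to a different PCR value."""
        secret, expected = self._sealed[name]
        if expected is not None and not hmac.compare_digest(expected, self.pcr):
            raise PermissionError("PCR policy mismatch")
        return secret


tpm = ToyTPM(pcr=hashlib.sha256(b"firmware v1").digest())
tpm.seal("bound", b"secret-a", bind_pcr=True)
tpm.seal("unbound", b"secret-b")

# Simulate a firmware update that changes the measurement.
tpm.pcr = hashlib.sha256(b"firmware v2").digest()
```

After the simulated update, `tpm.unseal("unbound")` still succeeds while `tpm.unseal("bound")` raises `PermissionError`. This models a changed measurement only; an update that wipes the TPM's storage, as reported elsewhere in the thread, loses both keys regardless of binding.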
- lloeki 17 hours ago
  Coincidentally this was a feature unknown to me until I performed a SSD migration from one server to another and Tailscale failed to connect because ("of course!" in hindsight) it failed to decrypt whatever.
  So not a TPM failure but certainly a gotcha! moment; luckily I had a fallback method to connect to the machine, otherwise in the particular situation I was in I would have been very sorry.
  The "whoever needs this will enable it" + support angle makes total sense.
- miki123211 16 hours ago
  So this is only disabled on platforms that use a TPM, e.g. Linux and Windows? What about Mac OS?
  - cronos 8 hours ago
    The macOS client uses the keychain by default, that's not changed here.
- zdware 19 hours ago
  i just started using tailscale and responses like this make me believe in the product. awesome!
- Thaxll a day ago
  Did you rely on the Google go tpm lib for that?
  - cronos a day ago
    Yes, we use github.com/google/go-tpm/tpm2
- dist-epoch a day ago
  Your suspicion is correct. I have an AMD AM5 motherboard, and every time I update its BIOS it warns me that the fTPM will be reset, and I know it does so because afterwards Bitlocker prompts me to enter the recovery key since it can't unlock the drive anymore.
- keepamovin 20 hours ago
  Does this mean TS is not FIPS 140-3 now?
  - tatersolid 11 hours ago
    It never was FIPS-approved and likely will never be. The wireguard protocol used by Tailscale uses ChaCha20 for encryption which is not FIPS approved.
    - keepamovin 10 hours ago
      Interesting. What is the FIPS version of wireguard?
      - cronos 8 hours ago
        There are some forks that are not compatible with regular wireguard, for example from wolfssl. Or just classic mTLS.
- jkaplowitz 19 hours ago
  Thank you for explaining that context!
xyzzy_plugh a day ago
This never should have been on by default. The end user (read: administrator) needs to know they want to use the TPM.
This is a huge foot gun for many devices.
The accompanying changelog note hints at why:
> Failure to load hardware attestation keys no longer prevents the client from starting. This could happen when the TPM device is reset or replaced.
This is unfortunate as for many, many deployments, you absolutely want this on. But because it's a time bomb for certain device/OS combinations that Tailscale can't control or predict, and folks install Tailscale on pretty much everything, then the incidence of borked installs can only rise.
- candiddevmike a day ago
  As someone with a passing interest in using TPM for crypto things, every time I think deeply about the implementation details like this, I come back to needing some kind of recovery password/backup key setup that entirely negates the point of the TPM in the first place. They seem really neat, but I struggle to see the benefit they have for doing crypto when a tiny slip up means your users' keys are poof, gone. And the tiny slip up may not even be with your software, but some edge case in the OS/TPM stack.
  - johncolanduoni a day ago
    The TPM was never designed to be the only holder of a key that cannot be reset. The idea was that it prevents you from typing in a password or resetting an attestation signature in a database for 99% of boots, but if certain things in the boot process change (as determined by the firmware, the CPU, the OS, and the application using the TPM) it’s designed to lock you out so those things cannot change without anyone’s notice.
    For that purpose they’re pretty good, though there are advantages to a more signature-oriented boot security option like Apple’s Secure Enclave. But that only works so well because Apple simply doesn’t permit altering parts of the macOS boot process. For Windows/Linux, you have a variety of hardware, firmware, and OS vendors all in the mix and agreeing on escrow of keys for all of them is hard.
    - themafia 13 hours ago
      The presumption is that the contents being secured are /so/ valuable that locking my device is preferable to any leak of them whatsoever.
      This is military level security and just isn't appropriate for most consumers. Particularly around something so rarely exercised and utilized by users as the boot process. A simple warning with a long timeout would have sufficed.
      Aside from that you have a hardware vendor, sourced into an integrated product from another vendor, sold to a user, with various third party software interacting with it. This was always going to result in questionable experiences for end users.
      - johncolanduoni 6 hours ago
        A warning doesn’t help at all. The main threat model for FDE is that someone steals your device and dumps the disk. If you don’t protect the boot process somehow, then you’re just storing the encryption key next to the data.
        If you don’t care about that (which is not “military level security”, laptop thieves stealing creds is a thing), just don’t use FDE or use it with an on-boot password every time. No point in the theater.
        themafia 2 hours ago
        > laptop thieves stealing creds is a thing
        Two factor is a thing. FDE is such a 1990s idea.
    - paulddraper 20 hours ago
      Whether by design or accident, this is correct.
      You backup a key or key creation mechanism or whatever elsewhere somewhere very safe.
      Then almost never touch it, as the TPM authenticates.
  - belorn a day ago
    The primary argument in favor of TPMs is the desire to assert against tampering to the boot system, and as a secondary effect it can be one of the solutions to reduce the need for users to type in passwords.
    You can still use crypto without a TPM, including with full disk encryption, and for LUKS specifically you can use multiple passwords and mechanisms to unlock the system. Different solutions will give different benefits and drawbacks. Me and a friend wrote a remote password provider for Debian called Mandos which uses machines on the local network as a way for unattended boots. It does not address the issue of tampering with the bios/boot loader, but for the primary purpose of protecting against someone stealing the server/disks it serves the purpose of allowing us to use encrypted disk without drawbacks of typing in passwords, and the backup server, itself with encrypted disks, handles the risk of needing recovery passwords. At most one needs to have an additional backup key installed for the backup server.
  - briHass a day ago
    TPM keys are great for things like SSH keys or Passkeys, which surprisingly works well even in Windows.
    The private key is safe from any exfiltration, and usage only requires a short PIN instead of a long passphrase. The TPM ensures you're physically typing that PIN at the machine not a remote desktop window or other redirection that could be hacked.
    Obviously, this is problematic/annoying for scripts and things that can't share the SSH session, because you need to PIN with every authentication. Also, for encryption, you want to use something where you can backup the private key before stashing it in the TPM. Windows allows you to do this with certificates that are exported for backup prior to encrypting the private key with an unexportable TPM key in Hello.
    - johncolanduoni 21 hours ago
      An easy solution to having to put your PIN in too often for SSH is to use the `ControlPersist` option in your SSH client config. This lets you only create a new SSH connection every 30s (or whatever you put), even if you’re doing separate operations. With a low timeout, there’s no realistic security risk (what’s the chance an attacker will only have control of your machine for 30s?).
      I do this for GitHub in particular, because of tools that connect to the remote multiple times. Works with anything that uses the actual ssh executable under the hood.
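A config fragment along these lines would go in `~/.ssh/config`. It is a sketch, not the commenter's actual setup: the `Host` pattern and socket path are assumptions, and `ControlPersist` only takes effect together with `ControlMaster` and a `ControlPath`:

```sshconfig
# Reuse one authenticated connection to github.com for 30s of idle time.
Host github.com
    ControlMaster auto
    ControlPath ~/.ssh/cm-%C
    ControlPersist 30s
```

With this in place, repeated `git fetch` or `git push` calls within the window share the first connection, so the PIN prompt appears once rather than per operation.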
  - nottorp a day ago
    Same with passkeys actually.
    - SchemaLoad a day ago
      Passkeys get synced between your devices so they aren't any more fragile than passwords in a password manager.
      - lsowen a day ago
        Passkeys _may_ be synced, but that isn't guaranteed. For example a "device bound passkey" isn't synced.
        tadfisher a day ago
        There is a project under way to specify how to "sync" device-bound keys between authenticators: https://fidoalliance.org/specs/cx/cxp-v1.0-wd-20241003.html
        Ideally this should have been hashed out before deploying passkeys everywhere, but I guess you can always register multiple passkeys for the sites that allow you to.
        nottorp 15 hours ago
        Iirc the original idea was that passkeys should be device specific. Of course that's impractical so now they're morphing to a long password that a human can't process.
        In a few years someone will post "how about a long human retainable passphrase?" as a new and improved discovery.
      - 0cf8612b2e1e a day ago
        The big providers only want themselves to be able to backup passkeys. I do not want to handover my secrets to Apple/Microsoft/Google.
        lilyball 5 hours ago
        Apple Keychain syncing is end-to-end encrypted, Apple cannot see the contents of your synced keychain.
  - XorNot a day ago
    The benefit is that you don't enter the recovery password most of the time.
    And when you do it should be rare and lead to a password reset.
- traceroute66 a day ago
  But e.g. Windows uses a TPM by default now ? If TPMs were such a major issue then there would be millions of Windows users with TPM problems, no ?
  I have no inside info, but this strikes me more as a bit of a "sledgehammer to crack a nut". Tailscale turning off important functionality due to small-but-vocal number of TPM edge cases ?
  It is also very unfortunate they did not manage to find any middle ground between the hard-binary all-on or all-off.
  - cronos a day ago
    Windows uses TPM for Bitlocker. A very common scenario where TPMs get reset is BIOS updates (when a TPM is implemented in firmware). AFAIK, Windows cheats here because it also manages BIOS updates. When an update happens, it takes extra steps to preserve the Bitlocker encryption key in plaintext, and re-seals it to the TPM after the update completes.
    Apart from Windows, there are many setups that fail in fun ways: Kubernetes pods that migrate from one VM with a TPM to another one, hypervisors that mount a virtual TPM to VMs, containers or VM images that do Tailscale registration on one machine and then get replicated to others, etc.
    Tailscale already did some attempts at cleverness when deciding whether to enable features using a TPM (e.g. probing for TPM health/version on startup, disabling node state encryption on Kubernetes pods), but there was still a long tail of edge cases.
    - pregnenolone 14 hours ago
      > Bitlocker encryption key in plaintext
      Actually, this is not the case. BitLocker wraps the key, meaning even if the TPM were compromised, one would still have to brute-force the PIN for the actual key. It’s cryptsetup on Linux that stores the key on the TPM in plaintext. This vulnerability has been known for quite a while and nothing has been done about it so far.
      https://arxiv.org/abs/2304.14717
      https://github.com/systemd/systemd/issues/37386
      https://github.com/systemd/systemd/pull/27502
    - gck1 a day ago
      > Windows cheats here
      Slightly off-topic: it also cheats in how TPM works for Bitlocker when you do TPM + PIN. One would assume PIN becomes part of the encryption key, but in reality, it's just used as the auth for TPM to release the key. So while it sounds like a two-factor solution, in reality it's just single factor.
      So the Bitlocker without TPM is actually a better idea and Windows makes it very painful to do if TPM is on.
      - _flux 11 hours ago
        Aren't PINs usually short, and might even be made out of just digits in the first place? So would there be real security benefits in adding that to the key?
        gck1 9 hours ago
        You can make PINs as complex as you want, there's only a maximum length limitation of 20 characters. There's no difference between passwords and PINs in Windows except that Windows calls it a PIN if it's used with TPM. And yes, it does nudge you in the direction of making it simple because "TPM guarantees security", but you don't have to.
      - ninkendo 20 hours ago
        I don’t know much about the TPM but if it’s anything like Apple’s Secure Enclave, it should require exponentially longer time after each incorrect PIN past the first one, making it so you can’t reasonably brute force it without getting lucky.
        I’m not sure how the typical “two factor” best practices would interpret one of the factors basically self destructing after 10 guesses, but IMO it’s a pretty decent system if done right.
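A rough sketch of the lockout scheme described above: each wrong guess doubles the required wait, and a fixed number of failures wipes the secret. `PinGuard`, the delay schedule, and the limit of 10 are illustrative assumptions, not the behavior of any particular TPM or Secure Enclave:

```python
import hmac


class PinGuard:
    """Toy PIN check with doubling delays and a hard wipe limit."""

    def __init__(self, pin: str, base_delay: float = 1.0, max_failures: int = 10):
        self._pin = pin.encode()
        self.base_delay = base_delay
        self.max_failures = max_failures
        self.failures = 0
        self.wiped = False

    def required_delay(self) -> float:
        """Seconds to wait before the next attempt (reported, not enforced)."""
        if self.failures == 0:
            return 0.0
        return self.base_delay * 2 ** (self.failures - 1)

    def try_pin(self, guess: str) -> bool:
        """Check a guess; reset the counter on success, wipe at the limit."""
        if self.wiped:
            raise RuntimeError("secret was wiped after too many failures")
        if hmac.compare_digest(guess.encode(), self._pin):
            self.failures = 0
            return True
        self.failures += 1
        if self.failures >= self.max_failures:
            self.wiped = True
        return False
```

With a 1 second base delay, the waits before the wipe at the 10th failure sum to 511 seconds, so brute force is bounded by the attempt limit rather than by time. That same limit is what lets someone with physical access lock the owner out on purpose.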
        fc417fc802 19 hours ago
        That's not the issue. The TPM isn't blinded in the above description meaning that if someone cracks the TPM they can get your key. Ideally both factors are always required to access the secret.
        If you're wondering, yes this is a security issue in practice. There have been TPM vulnerabilities in the past that enabled exfiltration of secrets.
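The difference between a PIN that only gates release and a PIN that is part of the key can be sketched as follows. This is an illustration of the two designs being contrasted, not a description of how BitLocker works (that point is disputed in this thread); the function names, salt, and iteration count are assumptions:

```python
import hashlib
import hmac


def release_only(tpm_secret: bytes, pin_ok: bool) -> bytes:
    """PIN as a gate: the disk key is exactly what the TPM stores."""
    if not pin_ok:
        raise PermissionError("wrong PIN")
    return tpm_secret


def mixed_key(tpm_secret: bytes, pin: str, salt: bytes) -> bytes:
    """PIN as key material: the TPM secret alone is not the disk key."""
    stretched = hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 100_000)
    return hmac.new(tpm_secret, stretched, hashlib.sha256).digest()


tpm_secret = bytes(range(32))  # stand-in for a secret held by the TPM
salt = b"demo-salt"            # illustrative fixed salt
```

In the first design, anyone who extracts `tpm_secret` from a vulnerable TPM has the disk key. In the second, they still have to guess the PIN, so its length and complexity matter even though the TPM limits online guessing.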
    - kozziollek a day ago ago
      > Windows cheats here because it also manages BIOS updates
      Is this (relatively) new?
      I don't use TPM and I rarely update BIOS unless I really need to, but I thought there was an option on my BIOS/UEFI to use a USB drive to update it. How would Windows know about it?
      [-]
      - toast0 a day ago ago
        Windows can get BIOS updates through windows update, if the OEM participates and packages them. I haven't seen BIOS updates through windows update on my systems where I built it from components, I've only seen it on integrated systems from major builders (HP, Lenovo, etc).
        The BIOS update instructions for my retail packaged motherboard indicate to turn off BitLocker before doing upgrades to prevent loss of TPM turning into a loss of access, but it'd be easier if it were automated.
      - hnuser123456 a day ago ago
        You can update with a USB drive, but if you have bitlocker enabled and don't temporarily disable it before the BIOS update, you'll need to reformat and reinstall Windows.
        [-]
        dist-epoch a day ago ago
        No, you can save a recovery key to a file or enter it from a printed one.
        [-]
        arjie a day ago ago
        I believe you can also get it from your online Microsoft account if that's what you logged in with once. I ran into this a while ago and had to do it that way. I didn't even know I'd set up Bitlocker.
    - c0nsumer 20 hours ago ago
      On Windows, certificates can also be stored in the TPM.
  - toast0 a day ago ago
    Windows seems to do two big things with a TPM. Bitlocker encryption and some microsoft account stuff.
    If the bitlocker stuff goes wrong, big problem, hopefully you printed and kept your recovery key.
    If the microsoft account stuff goes wrong, mostly the microsoft store and microsoft store apps break in subtle ways... but that's also how that ecosystem normally works, so how are you supposed to know it's the TPM problem?
  - jpk2f2 a day ago ago
    Windows automatically reinitializes the TPM if it's reset and boots normally; most end users will not notice any issues unless they have Bitlocker or biometrics configured.
  - JasonADrury a day ago ago
    The problem here seems to mostly have been that some exotic virtualization software insists on offering broken TPM.
- abtinf a day ago ago
  >> This could happen when the TPM device is reset or replaced.
  Isn’t that exactly the desired behavior to defend against physical attacks?
  [-]
  - horsawlarway a day ago ago
    Sure, but most users probably don't actually want this level of defense.
    For the same reason that most folks don't use bank vault doors on their house.
    Ex - even reasonably technical people hit this footgun in lots of edge cases... like updating their bios, changing the host of a vm running the tool, or having a k8s pod get scheduled on a different node.
    I'm surprised this was "default on" at all.
  - SchemaLoad a day ago ago
    Yes, but it turns out the TPM gets reset quite often on shitty hardware.
rstat1 a day ago ago
Here’s the PR explaining why they disabled this function
https://github.com/tailscale/tailscale/pull/18336
Seems like it caused tons of problems due to the variability of TPM quality among other things
jkaplowitz a day ago ago
From the changelog, it seems like this may have been due to issues caused by the on-by-default setting, although I don’t work for Tailscale and am speculating here with no inside info.
I wonder, would Tailscale be willing to confirm that they plan to fix whatever the issues are and re-enable this default within a short-ish timeframe? I currently have plenty of trust in the good intentions of the people running Tailscale, but with geopolitics as it currently is, I’d love to have a concrete reason even beyond that positive track record to believe that this change isn’t attempting to satisfy ease-of-surveillance concerns expressed by government agencies in whichever country.
[-]
- aiiane a day ago ago
  Seems like the issues in question are not within Tailscale's span of control (basically, the devices themselves with TPMs are too unreliable in the general population, so the feature is more appropriate for controlled environments that opt in to its usage).
  [-]
  - db48x a day ago ago
    The TPM devices themselves are reliable, but using them comes with a lot of caveats. 99% of users have never heard of the TPM, and 99% of the ones who have won’t have realized that upgrading the BIOS clears¹ the TPM. Add in the fact that Tailscale users didn’t _know_ that tailscale was using the TPM and you have a recipe for users breaking things without realizing it. In an enterprise environment where you can afford to hire people specifically to care about these things, using TPMs for additional security is a great idea.
    ¹: and very few of those can explain that it doesn’t actually clear the TPM. Instead it causes a different state to be measured by the TPM, and in that new state the TPM cannot unlock the keys that were previously stored in it. This is a great way to protect the computer against someone who can pull the hard drive out of the computer and try to read the data off of it, or who can substitute a different BIOS chip to get around a BIOS password, but not so great for ordinary users who want the occasional upgrade to go smoothly.
zero0529 a day ago ago
Thank god, I was running Tailscale on a nixos machine on some really old hardware and I couldn’t figure out why it kept crashing. It was because of this but it just failed silently.
jsiepkes a day ago ago
Guessing it was too support intensive? Caused too many issues for people who then reached out to support?
issung 15 hours ago ago
My Tailscale was broken for the past month and I only just fixed it yesterday, and today this patch is released that would have made it a non-issue.
Updating my BIOS caused the issue. The main problem was that Tailscale's behaviour was very poor in this case. It simply got stuck "Starting" and never provided any error information.
maufl 16 hours ago ago
Oh, I got bitten by that! I have my work Linux installation on a USB stick so I can boot it on either my desktop or laptop and one day tailscale stopped working. I thought that might be a rare situation, but it looks like TPM based encryption failed for other reasons too.
sp_c a day ago ago
So in linux it looks like we just update /etc/default/tailscaled with:
FLAGS="--encrypt-state"
...and hope for the best?
edit: I see this in my logs, I guess it is working:
migrated "/var/lib/tailscale/tailscaled.state" from plaintext to TPM-sealed format
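For automated deployments, a small check like the following could confirm that the defaults file actually carries the flag. This is only a sketch: it assumes the `FLAGS="..."` line format and the `--encrypt-state` flag exactly as quoted in the comment above, and the helper name `has_encrypt_state` is hypothetical.

```python
import shlex


def has_encrypt_state(defaults_text: str) -> bool:
    """Return True if a FLAGS=... line in the text includes --encrypt-state.

    Assumes a shell-style defaults file; commented-out lines are ignored.
    """
    for raw in defaults_text.splitlines():
        line = raw.strip()
        if line.startswith("#") or not line.startswith("FLAGS="):
            continue
        # shlex removes the shell quoting around the value
        tokens = shlex.split(line[len("FLAGS="):])
        # a quoted value may hold several flags separated by spaces
        flags = [flag for token in tokens for flag in token.split()]
        if "--encrypt-state" in flags:
            return True
    return False
```

A deployment script could call it as `has_encrypt_state(open("/etc/default/tailscaled").read())` and fail loudly when the result is false, rather than relying on the default.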
quotemstr 20 hours ago ago
And this is why in computing we can't have nice things. Any mass market product can't, in a business realistic evaluation, ship something that breaks even 1% of users.
Consequently, we're stuck with lowest common denominator everything and have a hard time deploying software fixes for what ails us. Instead of fixing things, we at best encapsulate the damage.
If I were running Tailscale, I'd say "Fuck the people with broken TPMs. Fix your computers. We're going to be secure by default."
I guess there's a reason Avery and not I call the shots there.
traceroute66 a day ago ago
TL;DR: Significant u-turn by tailscale.
Previously, with Tailscale 1.90.2 or later, node state storage was encrypted by default on all supported platforms.
As of yesterday, per changelog, state file encryption and hardware attestation keys are no longer enabled by default.
This effectively rolls back history to pre 1.90.2 and you will now have to enable it manually like you did during the public beta period (>= 1.86) of this short-lived new feature.
[-]
- snailmailman a day ago ago
  Not sure if it's a "significant" u-turn, when it's a relatively new feature. It's only been out for a few months, and seems to be getting rolled back because it was breaking things.
  It's annoying that a security benefit is being turned off, but it can be turned back on if you are confident it will not break your setup.
  [-]
  - traceroute66 a day ago ago
    > Not sure if it's a "significant" u-turn
    I would say it is because they made a big marketing blog post about it at the time[1] (August 2025). So clearly they considered it a significant new feature.
    The blog post ended with the words "If we don’t spot any major regressions with 1.86, the next stable release will likely turn on state encryption by default for all new nodes". It was then enabled by default 1.90.2 onwards (October 2025).
    That is why I would consider it a significant u-turn.
    [1]https://tailscale.com/blog/encrypting-data-at-rest
    [-]
    - hug a day ago ago
      I don't get it. It seems like they're doing largely what they said they would.
      They wanted to push a feature, and they said they would if they didn't see any major regressions. Then they did see a major regression, so they pulled the feature.
      Exact version numbers, timelines, and builds are pretty irrelevant to that process. Or are you actually saying you would prefer they had just left their product broken for a significant portion of users, just to keep aligned with the version numbers they mentioned in a blog post?
- Thaxll a day ago ago
  TPM is really badly implemented. When you upgrade your firmware, OS, everything can go south.
  Just upgrading your firmware with bitlocker enabled can brick your PC.
  [-]
  - londons_explore a day ago ago
    Windows uses full disk encryption with keys from the TPM by default.
    Nobody says "disable disk encryption right away in case the TPM forgets the keys". The vast majority of TPMs manage to not forget the keys.
    [-]
    - snailmailman a day ago ago
      They may not say "turn off bitlocker", but people definitely recommend backing up the recovery keys, and windows allows you to back up the key to microsoft because they know people won't actually back them up. Not sure if that happens by default, but they provide a variety of options for the recovery keys because there is definitely a non-zero chance you need them. There were several stories of this happening with the windows 10->11 upgrade push, where people were auto-updated and then scrambling to decrypt their hard drives.
    - bmandale a day ago ago
      If windows is encrypted with keys from the TPM anyways, then tailscale doesn't need to encrypt a second time.
      Windows also bit me in the ass with this feature, but tailscale not enabling encryption wouldn't have helped one iota.
      [-]
      - oktoberpaard a day ago ago
        Local software could be stealing plaintext secrets from your encrypted disk. Physical access is not the only attack vector.
        [-]
        bmandale 17 hours ago ago
        The only way to protect against that is if a secure application boundary is enforced by the operating system. You can make it harder for other programs to uncover secrets by encrypting them, but any other application can reverse the encryption. I don't believe using the tpm meaningfully changes that situation.
    - nottorp a day ago ago
      I'm curious. If the motherboard with the TPM dies, you're basically locked out of your data right? Keys backed up on MS server or not.
      [-]
      - dist-epoch a day ago ago
        No, the backed up keys (MS server, file, printed) give you full access, they contain the full encryption key.
        [-]
        db48x a day ago ago
        I suspect that they do not actually contain the encryption key. It is more convenient if the disk encryption key is stored on the disk, but separately encrypted. You actually want to store the key multiple times, one for each unlock method. If the disk can be unlocked with a password, then you store the key encrypted using the password (or encrypted using the output of a key derivation function run on the typed password). If it can be unlocked with a smartcard, then you store a copy that is encrypted using a key stored in the card. When Bitlocker uses the TPM, it no doubt asks the TPM to encrypt the key and then stores that on the disk. To decrypt the disk it can ask the TPM to decrypt the stored key, which will only succeed if the TPM is in the same state that it was in when the key was encrypted.
        The reason it's done this way is to allow multiple methods of accessing the disk, to allow the encryption password to be changed without having to rewrite every single sector of the disk, etc, etc. You can even “erase” the disk in one swift operation by simply erasing all copies of the key.
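The "one disk key, many wrapped copies" layout described above can be sketched as follows. This is a toy under stated assumptions: the slot format, the function names, and the XOR masking (a stand-in for a real key-wrap cipher) are invented for illustration and are not BitLocker's on-disk format. It also assumes a 32-byte disk key.

```python
import hashlib
import hmac
import secrets


def _pad_and_mac_key(secret: bytes, salt: bytes) -> tuple[bytes, bytes]:
    """Derive 64 bytes from an unlock secret: 32 to mask the key, 32 to authenticate."""
    material = hashlib.pbkdf2_hmac("sha256", secret, salt, 50_000, dklen=64)
    return material[:32], material[32:]


def wrap(disk_key: bytes, secret: bytes) -> dict:
    """Create one key slot. XOR masking is a toy stand-in for a real key-wrap cipher."""
    salt = secrets.token_bytes(16)
    pad, mac_key = _pad_and_mac_key(secret, salt)
    wrapped = bytes(a ^ b for a, b in zip(disk_key, pad))
    tag = hmac.new(mac_key, wrapped, hashlib.sha256).digest()
    return {"salt": salt, "wrapped": wrapped, "tag": tag}


def unwrap(slot: dict, secret: bytes) -> bytes:
    """Recover the disk key from a slot, or raise if the unlock secret is wrong."""
    pad, mac_key = _pad_and_mac_key(secret, slot["salt"])
    expected = hmac.new(mac_key, slot["wrapped"], hashlib.sha256).digest()
    if not hmac.compare_digest(expected, slot["tag"]):
        raise ValueError("wrong unlock secret for this slot")
    return bytes(a ^ b for a, b in zip(slot["wrapped"], pad))


disk_key = secrets.token_bytes(32)  # encrypts the sectors; never changes here
slots = {
    "password": wrap(disk_key, b"correct horse"),
    "recovery": wrap(disk_key, b"printed recovery code"),
}
# Changing the password rewrites one small slot, not the whole disk.
slots["password"] = wrap(unwrap(slots["password"], b"correct horse"), b"new password")
```

Deleting every entry in `slots` makes the disk key unrecoverable, which is the quick "erase" mentioned above. A TPM-backed slot would follow the same pattern, with the chip doing the unwrap only when its measured state matches.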
        [-]
        PunchyHamster 14 hours ago ago
        That is also required for any kind of key rotation to work. What you rotate is the key that protects the disk key; the alternative of using a key directly would mean re-encrypting the whole drive whenever it changes, and of course having only a single key instead of multiple.
        [-]
        nottorp 13 hours ago ago
        So if you’re using the TPM based encryption you’d better have a working backup system.
        How many home users have that? How many stories of personal data loss are we going to hear as windows 11 ready PCs start to die?
  - traceroute66 a day ago ago
    > TPM is really badly implemented. When you upgrade your firmware, OS, everything can go south.
    Could you elaborate? Firmware/OS should not affect TPM contents? Otherwise e.g. TPM-reliant Windows installs would break?
    In addition there are cloud scenarios where your VM has a TPM and you want to e.g. stop a malicious actor poaching your VM and running it elsewhere.
    Having the tailscale TPM tied to your cloud hypervisor prevents the "lift and shift" attack.
    [-]
    - Thaxll a day ago ago
      Every time I have to upgrade my MB firmware it breaks bitlocker and I have to either use the recovery keys from the microsoft website or disable bitlocker encryption before the upgrade.
      https://www.reddit.com/r/MSI_Gaming/comments/15w8wgj/psa_tpm...
    - stanac a day ago ago
      You can't reliably store secrets in tpm and expect it to work after an os update. Windows is using workarounds during windows update to avoid breaking bitlocker.
      https://learn.microsoft.com/en-us/windows/security/hardware-...
    - db48x a day ago ago
      You are correct. Updating the firmware or the OS does not actually erase the TPM. What is really going on is that the TPM register holds a value that is like a hash. Each time you measure the system state you update the register with a hash of the previous value and the measurement. When you ask the TPM to hold a key you specify which register value is used to encrypt the key. Later when you use the key it will fail if the TPM cannot decrypt the key. This can only happen if the TPM register has the wrong value, which can only happen if someone has tampered with the system. But voluntarily upgrading the BIOS or the OS looks exactly like tampering.
      The correct procedure is to unlock the keys, copy them out of the TPM, perform the upgrade, reboot to remeasure the system state, then finally store the keys back into the TPM.
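The register update described above ("a hash of the previous value and the measurement") can be simulated in a few lines. This is a simplified sketch: it models a single SHA-256 register starting at all zeros, and the component names and the `measure_boot` helper are made up for illustration. Real TPMs have several registers and seal keys against a policy over them.

```python
import hashlib


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """New register value = SHA-256(old value || SHA-256(measurement))."""
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()


def measure_boot(components: list[bytes]) -> bytes:
    """Fold a boot chain into one register value, starting from all zeros."""
    pcr = bytes(32)
    for component in components:
        pcr = extend(pcr, component)
    return pcr


# Hypothetical boot chains before and after a firmware upgrade.
sealed_against = measure_boot([b"firmware v1", b"bootloader", b"kernel"])
after_update = measure_boot([b"firmware v2", b"bootloader", b"kernel"])
```

Because each value depends on everything measured before it, `after_update` differs from `sealed_against` even though only the first component changed, and a key bound to the old value can no longer be released. From the register's point of view a legitimate upgrade and tampering are indistinguishable.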
  - adrr a day ago ago
    Wouldn't you want TPM to brick the machine if the firmware was modified? If something or someone modified your firmware, do you want the TPM key to remain intact? It's something you need to be aware of when upgrading firmware: disable encryption that relies on TPM or make a backup copy of the key.
- asgeirn a day ago ago
  From what I can deduce from the release notes and the linked documentation, it can still be enabled?
  And it relates to Windows and Linux only, and using the TPM.
  My guess is that unreliable TPMs made it risky to have this enabled by default.
  [-]
  - traceroute66 a day ago ago
    > it can still be enabled?
    Yes, just like >= 1.86, you set a flag during install.
    But that's not the point.
    The point is that >= 1.90.2 it became enabled by default.
    The point is that most people would expect that "by default" to be a permanent fixture, i.e. a sane secure-by-default config.
    This means that people with automated deployments based on >= 1.90.2 can no longer rely on the "by default" and this now needs to be flagged.
    [-]
    - esseph a day ago ago
      If your threat profile has you worried about tailscale + tpm, you probably shouldn't be running tailscale unless you're also running headscale...
      Just a thought.
- shepherdjerred a day ago ago
  what's the implication?
  [-]
  - usefulposter a day ago ago
    Help center - https://tailscale.com/kb/1596/secure-node-state-storage:
    >Secure node state storage can help protect against a malicious actor copying node state from one device to another, effectively cloning the node. By using platform-specific capabilities, Tailscale ensures node state encrypts at rest, making theft from disk and node cloning more difficult.
    Marketing blogpost - https://tailscale.com/blog/encrypting-data-at-rest:
    >What we really care about here are those private keys stored in the state file, since those are used to identify your node to the coordination server and to other nodes. We need to protect them from exfiltration.
    >If the Tailscale state file is unencrypted, an attacker with that kind of root access could use the file’s contents from a different machine and impersonate your node. From the perspective of the Tailscale coordination server, it’s as if your device switched to a different network and got a new IP address. We call this attack “node cloning”.
    [-]
    - nottorp a day ago ago
      So it protects you from an attacker who already has local root?
      [-]
      - cronos a day ago ago
        Not even that. An attacker with local root can just extract the wireguard keys from process memory, or use the TPM to decrypt the state file like Tailscale would.
        The only scenario where it helps is a local attacker who can read the state file on disk, but is not full root. Kinda unlikely on Linux, but could happen on Windows.
        [-]
        nottorp 15 hours ago ago
        > An attacker with local root can just extract the wireguard keys from process memory, or use the TPM to decrypt the state file like Tailscale would.
        That was my point :)
  - traceroute66 a day ago ago
    Historical blog post from tailscale (August 2025) saying how awesome and important this feature was[1].
    TL;DR If you care about the stuff mentioned in that blog post (which most sensible sysadmins would) then the implication is that you are no longer protected against those threat scenarios UNLESS you manually apply the flag at install time.
    Which means for people using deployment scripts/tools you now need to update those to put the flag in during installation. Because previously you could rely on the feature being "on by default", which is no longer the case.
    [1]https://tailscale.com/blog/encrypting-data-at-rest