Branch Privilege Injection: Exploiting branch predictor race conditions

(comsec.ethz.ch)

421 points | by alberto-m 7 months ago

219 comments

progval 7 months ago
Researchers' blog post: https://comsec.ethz.ch/research/microarch/branch-privilege-i...
Paper: https://comsec.ethz.ch/wp-content/files/bprc_sec25.pdf
[-]
- dang 7 months ago ago
  Thanks! We've changed the URL above from the university press release (https://ethz.ch/en/news-and-events/eth-news/news/2025/05/eth...) to that first link.
- ncr100 7 months ago ago
  Impact illustration:
  > [...] the contents of the entire memory to be read over time, explains Rüegge. “We can trigger the error repeatedly and achieve a readout speed of over 5000 bytes per second.” In the event of an attack, therefore, it is only a matter of time before the information in the entire CPU memory falls into the wrong hands.
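A rough worked version of the "only a matter of time" claim, as a sketch: it assumes the quoted 5000 bytes per second is sustained and constant, which the quote does not guarantee. The memory sizes are illustrative and not from the article.

```python
GIB = 2**30  # bytes in one gibibyte


def dump_time_seconds(memory_bytes: int, bytes_per_second: float = 5000.0) -> float:
    """Seconds needed to read `memory_bytes` at a constant leak rate."""
    if bytes_per_second <= 0:
        raise ValueError("leak rate must be positive")
    return memory_bytes / bytes_per_second


# Illustrative memory sizes (assumptions, not from the article).
for size_gib in (1, 16, 64):
    days = dump_time_seconds(size_gib * GIB) / 86400
    print(f"{size_gib:>3} GiB: {days:.1f} days")
# prints 2.5, 39.8 and 159.1 days respectively

# A single 4 KiB page, by contrast, takes under a second at that rate.
print(f"4 KiB page: {dump_time_seconds(4096):.2f} s")
```

So a full dump is slow, but an attacker who knows roughly where a secret lives needs far less than the whole of memory.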
  [-]
  - formerly_proven 7 months ago ago
    Prepare for another dive maneuver in the benchmarks department I guess.
    [-]
    - tsukikage 7 months ago ago
      We need software and hardware to cooperate on this. Specifically, threads from different security contexts shouldn't get assigned to the same core. If we guarantee this, the fences/flushes/other clearing of shared state can be limited to kernel calls and process lifetime events, leaving all the benefits of caching and speculative execution on the table for things actually doing heavy lifting without worrying about side channel leaks.
      [-]
      - tankenmate 7 months ago ago
        I get you, but devs struggle to configure nginx to serve their overflowing cauldrons of 3rd party npm modules of witches' incantations. Getting them to securely design and develop security-labelled, cgroup-based micro (nano?) compute services for inferencing text of various security levels is beyond even 95% of coders. I'd posit that it would be a herculean effort even for 1% devs.
        Just fix the processors?
        [-]
        tsukikage 7 months ago ago
        It's not a "just" if the fix cripples performance; it's a tradeoff. It is forced to hurt everything everywhere because the processor alone has no mechanism to determine when the mitigation is actually required and when it is not. It is 2025 and security is part of our world; we need to bake it right into how we think about processor/software interaction instead of attempting to bolt it on after the fact. We learned that lesson for internet facing software decades ago. It's about time we learned it here as well.
        [-]
        tankenmate 7 months ago ago
        Is the juice worth the squeeze? Not everything needs Orange Book (DoD 5200.28-STD) Class B1 systems.
      - immibis 7 months ago ago
        how will this prevent JavaScript from leaking my password manager database?
    - cenamus 7 months ago ago
      And if not, why did they introduce severe bugs for a tiny performance improvement?
      [-]
      - bloppe 7 months ago ago
        It's not tiny. Speculative execution usually makes code run 10-50% faster, depending on how many branches there are
        [-]
        bee_rider 7 months ago ago
        Yeah… folks who think this is just some easy to avoid thing should go look around and find the processor without branch prediction that they want to use.
        On the bright side, they will get to enjoy a much better music scene, because they’ll be visiting the 90’s.
        [-]
        yencabulator 7 months ago ago
        > Does Branch Privilege Injection affect non-Intel CPUs?
        > No. Our analysis has not found any issues on the evaluated AMD and ARM systems.
        wbl 7 months ago ago
        IBM Stretch had branch prediction. Pentium in the early 1990s had it. It's a huge win with any pipelining.
        titzer 7 months ago ago
        That's a vast underestimate. Putting in lfence before every branch is on the order of 10X slowdown.
        [-]
        grumbelbart2 7 months ago ago
        There is of course a slight chicken-egg-thing here: If there was no (dynamic) branch prediction, we (as in compilers) would emit different code that is faster for non-predicting CPUs (and presumably slower for predicting CPUs). That would mitigate a bit of that 10x.
        [-]
        anyfoo 7 months ago ago
        A bit. I think we've shown time and time again that letting the compiler do what the CPU is doing doesn't work out, most recently with Itanium.
        thesz 7 months ago ago
        The issue is with indirect branches. Most branches are direct ones.
        cenamus 7 months ago ago
        Of course I know that.
        But if the fix for this bug (how many security holes have there been now in Intel CPUs? 10?) brings only a couple % performance loss, like most of them so far, how can you even justify that at all? Isn't there a fundamental issue in there?
        autoexec 7 months ago ago
        How much improvement would there still be if we weren't so lazy when it comes to writing software? That is, if we were working to get as much performance out of the machines as possible and avoiding useless bloat, instead of just counting on the hardware to be "good enough" to handle the slowness with some grace.
      - umanwizard 7 months ago ago
        A modern processor pipeline is dozens of cycles deep. Without branch prediction, we would need to know the next instruction at all times before beginning to fetch it. So we couldn’t begin fetching anything until the current instruction is decoded and we know it’s not a branch or jump. Even more seriously, if it is a branch, we would need to stall the pipeline and not do anything until the instruction finishes executing and we know whether it’s taken or not (possibly dozens of cycles later, or hundreds if it depends on a memory access). Stalling for so many cycles on every branch is totally incompatible with any kind of modern performance. If you want a processor that works this way, buy a microcontroller.
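The stall argument can be put into a back-of-the-envelope model. This is only a sketch: the branch fraction, resolve latency and predictor miss rate below are assumed round numbers, not measurements, and the model ignores everything else that affects throughput.

```python
def cycles_per_instruction(
    branch_fraction: float,
    stall_cycles: float,
    stall_probability: float = 1.0,
    base_cpi: float = 1.0,
) -> float:
    """Average CPI when a fraction of instructions are branches that
    stall the pipeline with the given probability."""
    return base_cpi + branch_fraction * stall_probability * stall_cycles


# Assumed workload: 20% branches, 15 cycles to resolve a branch.
BRANCH_FRACTION = 0.2
RESOLVE_CYCLES = 15

# No prediction: every branch waits until it resolves.
no_prediction = cycles_per_instruction(BRANCH_FRACTION, RESOLVE_CYCLES)

# With prediction: only mispredicted branches pay (assumed 3% miss rate).
with_prediction = cycles_per_instruction(
    BRANCH_FRACTION, RESOLVE_CYCLES, stall_probability=0.03
)

print(f"CPI without prediction: {no_prediction:.2f}")   # 4.00
print(f"CPI with prediction:    {with_prediction:.2f}")  # 1.09
print(f"slowdown without it:    {no_prediction / with_prediction:.2f}x")  # 3.67x
```

With deeper pipelines or branches that depend on memory loads, `RESOLVE_CYCLES` grows and the gap widens accordingly.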
        [-]
        tremon 7 months ago ago
        But branch prediction doesn't necessarily need complicated logic. If I remember correctly (it's been 20 years since I read any papers on it), the simple heuristic "all relative branches backwards are taken, but forward and absolute branches are not" could achieve 70-80% performance of the state-of-the-art implementations back then.
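The static heuristic is easy to compare against a simple dynamic predictor in a toy simulation. This is a sketch on a made-up trace (three branches in a 10-iteration loop, with hypothetical addresses), so the numbers say nothing about real workloads. The dynamic side is a classic 2-bit saturating counter per branch, not a modern predictor.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Branch(NamedTuple):
    pc: int       # address of the branch instruction
    target: int   # address jumped to if taken
    taken: bool   # actual outcome


def synthetic_trace(outer: int = 100, inner: int = 10) -> Iterator[Branch]:
    """Hypothetical loop body with three branches per iteration."""
    for _ in range(outer):
        for i in range(inner):
            yield Branch(0x20, 0x30, i == 0)          # forward, rarely taken
            yield Branch(0x34, 0x50, i != 5)          # forward, mostly taken
            yield Branch(0x60, 0x10, i != inner - 1)  # backward loop branch


def btfn_accuracy(trace: Iterable[Branch]) -> float:
    """Static rule: backward branches taken, forward branches not taken."""
    hits = total = 0
    for b in trace:
        predicted_taken = b.target < b.pc
        hits += predicted_taken == b.taken
        total += 1
    return hits / total


def two_bit_accuracy(trace: Iterable[Branch]) -> float:
    """Dynamic rule: one 2-bit saturating counter per branch address."""
    counters: Dict[int, int] = {}
    hits = total = 0
    for b in trace:
        state = counters.get(b.pc, 1)  # start weakly not-taken
        predicted_taken = state >= 2
        hits += predicted_taken == b.taken
        total += 1
        counters[b.pc] = min(state + 1, 3) if b.taken else max(state - 1, 0)
    return hits / total


print(f"static BTFN:   {btfn_accuracy(synthetic_trace()):.1%}")    # 63.3%
print(f"2-bit counter: {two_bit_accuracy(synthetic_trace()):.1%}")  # 89.9%
```

The static rule loses almost entirely on the mostly-taken forward branch, which is the case a dynamic predictor learns after a couple of iterations.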
        [-]
        anyfoo 7 months ago ago
        Do you mean overall or localized to branch prediction? Assuming all of that is true, you're talking about a 20-30% performance hit?
        superblas 7 months ago ago
        > If you want a processor that works this way, buy a microcontroller.
        The ARM Cortex-R5F and Cortex-M7, to name a few, have branch predictors as well, for what it’s worth ;)
        jeffbee 7 months ago ago
        You can still have a static branch predictor. That has surprisingly good coverage. I'm not saying this is a great idea, just pointing it out.
- trebligdivad 7 months ago ago
  Thanks! It would be great if someone could update the title URL to that blog post; the press release is worse than useless.
  [-]
  - dang 7 months ago ago
    Ok, we've changed to that from https://ethz.ch/en/news-and-events/eth-news/news/2025/05/eth... above.
    [-]
    - alberto-m 7 months ago ago
      I don't know guys. Yes, the direct link saves a click, but the original title was more informative for the casual reader. I'm not a professional karma farmer and in dang's shoes I would have made the same adjustment, but I can't deny that seeing the upvote rate going down by 75% after the change was a little harsh.
      [-]
      - dang 7 months ago ago
        It was on the frontpage for 23 hours (and still is!) so the submission still did unusually well.
        I thought about adding the blog post link to the top text (a bit like in this thread: https://news.ycombinator.com/item?id=43936992), but https://news.ycombinator.com/item?id=43974971 was the top comment for most of the day, and that seemed sufficient.
        Edit: might as well belatedly do that!
        [-]
        tmtvl 7 months ago ago
        Thanks for all your hard work as always, dang.
    - trebligdivad 7 months ago ago
      Thanks!
eigenform 7 months ago
Great read! Some boiled-down takeaways:
- Predictor updates may be deferred until sometime after a branch retires. Makes sense, otherwise I guess you'd expect that branches would take longer to retire!
- Dispatch-serializing instructions don't stall the pipeline for pending updates to predictor state. Also makes sense, considering you've already made a distinction between "committing the result of the branch instruction" and "committing the result of the prediction".
- Privilege-changing instructions don't stall the pipeline for pending updates either. Also makes sense, but only if you can guarantee that the privilege level is consistent between making/committing a prediction. Otherwise, you might be creating a situation where predictions generated by code in one privilege level may be committed to state used in a different one?
Maybe this is hard because "current privilege level" is not a single unambiguous thing in the pipeline?
mettamage 7 months ago
Good to see Kaveh Razavi, he used to teach at my uni, the Vrije Universiteit in Amsterdam :) The course Hardware Security was crazy cool and delved into stuff like this.
[-]
- markus_zhang 7 months ago ago
  I checked out this course (and another one from Vrije about malware) a couple of years ago, back then there was very little public info about the courses.
  Do you know if there is any official recording or notes online?
  Thanks in advance.
  [-]
  - thijsr 7 months ago ago
    As far as I am aware, the course material is not public. Practical assignments are an integral part of the courses given by the VUSEC group, and unfortunately those are difficult to do remotely without the course infrastructure.
    The Binary and Malware Analysis course that you mentioned builds on top of the book "Practical Binary Analysis" by Dennis Andriesse, so you could grab a copy of that if you are interested.
    [-]
    - mettamage 7 months ago ago
      Ah yea, he gave a guest lecture on how he hacked a botnet!
      More info here: https://krebsonsecurity.com/2014/06/operation-tovar-targets-...
      it's been a while back :)
    - markus_zhang 7 months ago ago
      Thanks. I understand that it is difficult to do it remotely.
      I do have the book! I bought it a while ago but did not have the pleasure to check it out.
  - mettamage 7 months ago ago
    No, but last time I checked you can be a contracted student for 1200 euros.
    If I knew what I was getting into at the time, I'd do it. I did pay for extra, but in my case it was the low Dutch rate, so for me it was 400 euros to follow hardware security, since I already graduated.
    But I can give a rough outline of what they taught. It has been years ago but here you go.
    Hardware security:
    * Flush/Reload
    * Cache eviction
    * Spectre
    * Rowhammer
    * Implement research paper
    * Read all kinds of research papers of our choosing (just use VUSEC as your seed and you'll be good to go)
    Binary & Malware Analysis:
    * Using IDA Pro to find the exact assembly line where the unpacker software we had to analyze unpacked its software fully into memory. Also we had to disable GDB debug protections. Something to do with ptrace and nopping some instructions out, if I recall correctly (look, I only low level programmed in my security courses and it was years ago - I'm a bit flabbergasted I remember the rough course outlines relatively well).
    * Being able to dump the unpacked binary program from memory onto disk. Understanding page alignment was rough. Because even if you got it, there were a few gotchas. I've looked at so many hexdumps it was insane.
    * Taint analysis: watching user input "taint" other variables
    * Instrumenting a binary with Intel PIN
    * Cracking some program with Triton. I think Triton helped to instrument your binary with the help of Intel PIN by putting certain things (like xor's) into an SMT equation or something and you had this SMT/Z3 solver thingy and then you cracked it. I don't remember the details; I got a 6 out of 10 for this assignment and had a hard time cracking the real thing.
    Computer & Network Security:
    * Web security: think XSS, CSRF, SQLi and reflected SQLi
    * Application security: see binary and malware analysis
    * Network security: we had to create our own packet sniffer and we enacted a Kevin Mitnick attack (it's an old school one) where we had to spoof our IP addresses, figure out the algorithm to create TCP sequence numbers - all in the blind without feedback. Kevin in '94 I believe attacked the San Diego super computer (might be wrong about the details here). He noticed that the super computer S trusted a specific computer T. So the assignment was to spoof the address of T and pretend we were sending packets from that location. I think... writing this packet sniffer was my first C program. My prof. thought I was crazy that this was my first time writing C. I was, I also had 80 hours of time and motivation per week. So that helped.
    * Finding vulnerabilities in C programs. I remember: stack overflows, heap overflows and format strings bugs.
    -----
    For binary & malware analysis + computer & network security I highly recommend hackthebox.eu
    For hardware security, I haven't seen an alternative. To be fair, I'm not looking. I like to dive deep into security for a few months out of the year and then I can't stand it for a while.
    [-]
    - markus_zhang 7 months ago ago
      Wow, thanks a lot for the detailed answer. I'm going to see if I can register as a contracted student, but they probably do not accept remote students.
      BTW I can see you were very motivated back then. It got to be pretty steep but you managed to break through. Congrats!
      [-]
      - mettamage 7 months ago ago
        Remote won't work yea. It has to be in-person.
        > BTW I can see you were very motivated back then. It got to be pretty steep but you managed to break through. Congrats!
        Thanks! Yea I was :)
rakingleaves 7 months ago
Anyone know how this relates to the Training Solo attack that was just disclosed? https://www.vusec.net/projects/training-solo/
[-]
- hashstring 7 months ago ago
  Both exploit Spectre V2, but in different ways. My takeaway:
  Training Solo: - Enter the kernel (and switch privilege level) and “self train” to mispredict branches to a disclosure gadget, leak memory.
  Branch predictor race conditions: - Enter the kernel while your trained branch predictor updates are still in flight, causing the updates to be associated with the wrong privilege level. Again, use this to redirect a branch in the kernel to a disclosure gadget, leak memory.
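The race described for Branch Privilege Injection can be illustrated with a toy model. This is a sketch only: `ToyPredictor`, its fields and the addresses are hypothetical, and nothing here reflects how Intel's predictor is actually built. It just shows how a deferred update that gets tagged at commit time, rather than at the time the branch ran, can end up under the wrong privilege level.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

USER, KERNEL = "user", "kernel"


@dataclass
class ToyPredictor:
    # True models the race: updates are tagged with the privilege level
    # that is current when they commit. False tags them at issue time.
    tag_at_commit: bool
    privilege: str = USER
    pending: deque = field(default_factory=deque)
    table: dict = field(default_factory=dict)  # (privilege, pc) -> target

    def retire_branch(self, pc: int, target: int) -> None:
        """Queue a predictor update; it is not applied immediately."""
        self.pending.append((self.privilege, pc, target))

    def switch_privilege(self, new_privilege: str) -> None:
        """Change privilege without waiting for pending updates."""
        self.privilege = new_privilege

    def drain(self) -> None:
        """Commit all queued updates to the prediction table."""
        while self.pending:
            issued_privilege, pc, target = self.pending.popleft()
            tag = self.privilege if self.tag_at_commit else issued_privilege
            self.table[(tag, pc)] = target

    def predict(self, pc: int) -> Optional[int]:
        """Look up a prediction valid for the current privilege level."""
        return self.table.get((self.privilege, pc))


def attack(tag_at_commit: bool) -> Optional[int]:
    predictor = ToyPredictor(tag_at_commit=tag_at_commit)
    predictor.retire_branch(pc=0x1000, target=0xBAD0)  # trained in user mode
    predictor.switch_privilege(KERNEL)                 # e.g. a system call
    predictor.drain()                                  # update lands late
    return predictor.predict(0x1000)                   # kernel-mode lookup


print(attack(tag_at_commit=True))   # 47824 (0xBAD0): user-trained target seen in kernel mode
print(attack(tag_at_commit=False))  # None: update stays tagged as user
```

In this model the fix is either to tag updates when the branch executes or to drain the queue before the privilege switch; which, if either, matches the real mitigation is not something the sketch can say.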
rini17 7 months ago
If CPU branch predictor had bits of information readily available to check buffer boundaries and privilege level of the code, all this would be much easier to prevent. But apparently that will only happen when we pry out the void* from the cold C programmers' hands and start enriching our pointers with vital information.
[-]
- ajross 7 months ago ago
  I don't see how you think that will help? It's not about software abstraction, it's about hardware. Changing the "pointer" does nothing to the transistors.
  Doing what you want would essentially require a hardware architecture where every load/store has to go through some kind of "augmented address" that stores boundary information.
  Which is to say, you're asking for 80286 segmentation. We had that, it didn't do what you wanted. And the reason is that those segment descriptors need to be loaded by software that doesn't mess things up. And it doesn't, it's "just a pointer" to software and amenable to the same mistakes.
  [-]
  - nine_k 7 months ago ago
    Why stop at 80286, consider going back to the ideas of iAPX432, but with modern silicon tech and the ability to spend a few million transistors here and there.
    (CHERI already exists on ARM and RISC-V though.)
    [-]
    - ajross 7 months ago ago
      FWIW, the 286 launched like four months after the 432.
  - rini17 7 months ago ago
    286 far pointers were used sparingly, to save precious memory. Now we don't have any such problem and there are still unused bits in pointers even on largest 64 bit systems that might be repurposed perhaps. With virtual memory, there are all kinds of hardware supported address mappings and translations and IOMMU already so adding more transistors isn't an issue. The issue is purely cultural as you have just shown, people can't imagine it.
    [-]
    - ajross 7 months ago ago
      That's misunderstanding the hardware. All memory access on a 286 was through a segment descriptor, every access done in protected mode was checked against the segment limit. Every single one.
      A "far pointer" was, again, a *software* concept where you could tell the compiler that this particular pointer needed to use a different descriptor than the one the toolchain assumed (by convention!) was loaded in DS or SS.
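A toy model of the limit check may help show why segmentation alone did not catch software mistakes. This is a sketch: the descriptor is reduced to a base and a limit, the buffer layout is hypothetical, and `ProtectionFault` merely stands in for the fault the CPU would raise.

```python
from dataclasses import dataclass


class ProtectionFault(Exception):
    """Stands in for the fault raised on a segment limit violation."""


@dataclass(frozen=True)
class SegmentDescriptor:
    base: int   # physical base address of the segment
    limit: int  # highest valid offset within the segment


def translate(seg: SegmentDescriptor, offset: int, size: int = 1) -> int:
    """Check the access against the segment limit, then form the address."""
    if offset < 0 or offset + size - 1 > seg.limit:
        raise ProtectionFault(f"offset {offset:#x} exceeds limit {seg.limit:#x}")
    return seg.base + offset


# One 64 KiB data segment holding two adjacent 16-byte buffers
# (hypothetical layout chosen for illustration).
data_seg = SegmentDescriptor(base=0x10000, limit=0xFFFF)
BUF_A, BUF_B, BUF_LEN = 0x0100, 0x0110, 16

# Overrunning buf_a by one byte lands in buf_b. The check passes because
# the access is still inside the segment the software chose to load.
overrun = translate(data_seg, BUF_A + BUF_LEN)
assert overrun == data_seg.base + BUF_B

# Only a descriptor sized to the buffer itself would catch the overrun.
buf_a_seg = SegmentDescriptor(base=0x10000 + BUF_A, limit=BUF_LEN - 1)
try:
    translate(buf_a_seg, BUF_LEN)
except ProtectionFault as fault:
    print("caught:", fault)
```

The hardware check is only as fine-grained as the descriptors that software sets up, which is the point about segment descriptors being "just a pointer" to the code that loads them.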
  - nottorp 7 months ago ago
    I suppose a CPU that only runs Rust p-code is what the OP is dreaming about...
    [-]
    - ajross 7 months ago ago
      Generated rust "p-code" would presumably be isomorphic to LLVM IR, which doesn't have this behavior either and would be subject to the same exploits.
      Again, it's just not a software problem. In the real world we have hardware that exposes "memory" to running instructions as a linear array of numbers with sequential addresses. As long as that's how it works, you can demand an out of bounds address (because the "bounds" are a semantic thing and not a hardware thing).
      It is possible to change that basic design principle (again, x86 segmentation being a good example), but it's a whole lot more involved than just "Rust Will Fix All The Things".
      [-]
      - nottorp 7 months ago ago
        Holy... I need to stop making fun of Rust (*). I keep getting misinterpreted.
        (*) ... although I don't think I can abstain ...
- quotemstr 7 months ago ago
  You want CHERI.
- ActorNightly 7 months ago ago
  Or people could just understand the scope of the issue better, and realize that just because something has a vulnerability doesn't mean there is a direct line to an attack.
  In the case of speculative execution, you need an insane amount of prep to use that exploit to actually do something. The only real way this could ever be used is if you have direct access to the computer where you can run low level code. It's not like you can write JS code with this that runs on browsers that lets you leak arbitrary secrets.
  And in the case of systems that are valuable enough to exploit with a risk of a dedicated private or state funded group doing the necessary research and targeting, there should be a system that doesn't allow unauthorized arbitrary code to run in the first place.
  I personally disable all the mitigations because performance boost is actually noticeable.
  [-]
  - vlovich123 7 months ago ago
    > It's not like you can write JS code with this that runs on browsers that lets you leak arbitrary secrets
    That's precisely what Spectre and Meltdown were though. It's unclear whether this attack would work in modern browsers, but they did reenable SharedArrayBuffer, and it's unclear if the existing mitigations for Spectre/Meltdown stymie this attack.
    > I personally disable all the mitigations because performance boost is actually noticeable.
    Congratulations, you are probably susceptible to JS code reading crypto keys on your machine.
    [-]
    - nine_k 7 months ago ago
      Disabling some mitigations makes sense for an internal box that does not run arbitrary code from the internet, like a build server, or a load balancer, or maybe even a stateless API-serving box, as long as it's not a VM on a physical machine shared with other tenants.
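Before deciding either way, it helps to see what the kernel reports for a given box. On Linux, the kernel exposes one file per known issue under `/sys/devices/system/cpu/vulnerabilities`. A minimal sketch for reading it follows; the function name is made up, and on non-Linux systems or old kernels the directory will simply be absent.

```python
from pathlib import Path
from typing import Dict

# Linux sysfs directory with one status file per known CPU vulnerability.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(vuln_dir: Path = VULN_DIR) -> Dict[str, str]:
    """Return {vulnerability name: kernel-reported status}.

    Returns an empty dict if the directory does not exist.
    """
    if not vuln_dir.is_dir():
        return {}
    status: Dict[str, str] = {}
    for entry in sorted(vuln_dir.iterdir()):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "<unreadable>"
    return status


if __name__ == "__main__":
    for name, state in read_mitigation_status().items():
        print(f"{name:28} {state}")
```

Each value is a free-form string from the kernel, typically starting with "Not affected", "Vulnerable" or "Mitigation:", so it is best read by a human rather than parsed strictly.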
      [-]
      - anyfoo 7 months ago ago
        You run "arbitrary code from the internet" as soon as you use a web browser with JS enabled.
        [-]
        nine_k 7 months ago ago
        This is exactly what you won't do on most of your infrastructure boxes, would you? If you can reasonably trust all the software on the whole box, many mitigations that protect against effects of running adversary code on your machine become superfluous.
        OTOH if an adversary gets a low-privilege RCE on your box, exploiting something like Spectre or RowHammer could help elevate the privilege level, and more easily mount an attack on your other infrastructure.
        [-]
        anyfoo 7 months ago ago
        Yeah, as stated in a sibling answer, I misread your comment a little bit. It's true, on at least some classes of infrastructure boxes, you more or less "own all that is on the machine" anyway.
        But also note my caveat about database servers, for example. A database server shared between accounts of different trust levels will be affected, if the database supports stored procedures for example. Basically, as soon as there's anything on the box that not all users of it should be able to access anyway, you'll have to be very, very careful.
        [-]
        vlovich123 7 months ago ago
        While that’s an interesting idea, I’m not sure a side channel attack is actually exploitable by a stored procedure as I don’t believe it has enough gadgets.
        [-]
        anyfoo 7 months ago ago
        I don't know. PL/SQL (which is separate from SQL) is effectively a general purpose language, and kind of a beast at that. I have not the faintest idea, but at least I wouldn't be surprised to see high enough precision timers, and maybe it even getting JITted down into machine code for performance nowadays. (And I've read that tight loops can be used for timing in side channel attacks as well, although I assume it requires a lot more knowledge about the device you're running on.)
        A quick search reveals that there is at least a timer mechanism, but I have no idea of any of its properties: https://docs.oracle.com/en/database/oracle/oracle-database/1...
        But what I'm actually trying to say, is: For multiple intents and purposes (which might or might not include relevance to this specific vulnerability), as soon as you allow stored procedures in your database, "not running arbitrary code" is no longer a generally true statement.
        ActorNightly 7 months ago ago
        You need some lowish level programming primitives to execute side channel attacks. For example, you can't do cache timing with SQL.
        [-]
        anyfoo 7 months ago ago
        PL/SQL, not SQL. Whatever I knew about PL/SQL in the 90s and early 2000s I've forgotten, but I wouldn't be so certain that PL/SQL a) does not have precise enough timing primitives, and b) does not get JITed down into machine code nowadays. It is a fully fledged, turing complete programming language with loops, arrays etc.
        vlovich123 7 months ago ago
        What infrastructure box are you running that is running 100% all your code? Unless you ignore supply chain attacks, you’ve always got exposure.
        [-]
        ActorNightly 7 months ago ago
        Excluding hardware supply chain attack, you start with a secure linux distro that is signed, and then the code that you write basically is written from scratch, using only the core libraries.
        I got really good at CS because I used to work for a contractor in a SCIF where we couldn't bring in any external packages, so I basically had to write C code for things like web servers from scratch.
        dwattttt 7 months ago ago
        Or with JS disabled. HTML isn't as expressive, but it's still "arbitrary code from the internet"
        [-]
        anyfoo 7 months ago ago
        There is a difference. JS is turing complete, pure HTML is far from (as far as I'm aware). So HTML might (!) well be restricted enough to not be able to carry out such an attack.
        But I'd never state that definitively, as I don't know enough about what HTML without JS can do these days. For all I know there's a turing tarpit in there somewhere...
        [-]
        nine_k 7 months ago ago
        CSS3 is Turing-complete, but creating an exploit using just it would be... quite a feat.
        With JS or WASM, it's much more straightforward.
        vlovich123 7 months ago ago
        HTML doesn’t have the potential to deliver Spectre like attacks because:
        1. No timers - timers are generally a required gadget, and they often need to be high-resolution; otherwise building a suitable timing gadget gets harder and the bandwidth of the attack goes down
        2. No loops - you have to do timing stuff in a loop to exploit bugs in the predictor.
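Why both points matter together can be shown with a small model of a coarsened timer. This is a sketch with assumed numbers: the hit and miss latencies and the 100 microsecond resolution are illustrative, not taken from any particular CPU or browser.

```python
def coarse_now(true_time_ns: int, resolution_ns: int) -> int:
    """Round a timestamp down to the timer's resolution."""
    return (true_time_ns // resolution_ns) * resolution_ns


def measure(duration_ns: int, resolution_ns: int, start_ns: int = 0) -> int:
    """Duration of an event as seen through a coarse timer."""
    end = coarse_now(start_ns + duration_ns, resolution_ns)
    return end - coarse_now(start_ns, resolution_ns)


HIT_NS, MISS_NS = 50, 300  # assumed cache-hit and cache-miss latencies
COARSE_NS = 100_000        # assumed 100 microsecond timer resolution
REPEATS = 1000             # how often the access is repeated in a loop

# A single access: hit and miss look identical through the coarse timer.
print(measure(HIT_NS, COARSE_NS), measure(MISS_NS, COARSE_NS))
# prints: 0 0

# The same access repeated in a loop: the difference becomes visible.
print(measure(REPEATS * HIT_NS, COARSE_NS), measure(REPEATS * MISS_NS, COARSE_NS))
# prints: 0 300000
```

A coarse timer can be worked around by amplifying the signal in a loop, at the cost of bandwidth. Without any timer and without loops, neither route is available.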
        baobun 7 months ago ago
        Which you wouldn't do on an internal load balancer or database server, right?
        [-]
        anyfoo 7 months ago ago
        You are right, I sort of misread the statement I was replying to, but also wanted to reinforce that the large class of personal desktop machines is still very much affected, even if you "think" that you don't run "arbitrary code" on your machine.
        By the way, you have to be careful on your database server to not actually run arbitrary code as well. If your database supports stored procedures (think PL/SQL), that qualifies, if the clients that are able to create the stored procedures are not supposed to be able to access all data on that server anyway.
        [-]
        baobun 7 months ago ago
        Oh yeah. Supply-chain risk is still a thing too and defense-in-depth is not a bad strategy.
        Physical isolation simplifies a lot of this. This class of attacks isn't (as) relevant for single-tenant single-workload dedicated machines.
        [-]
        vlovich123 7 months ago ago
        Based on this thread, I think people badly misjudge what “single-tenant” means in the context of susceptibility to exploits.
        [-]
        baobun 7 months ago ago
        Mind elaborating?
        [-]
        vlovich123 7 months ago ago
        Your “infrastructure” server could be a CI server - it’s just building “my” code ignoring that many (all?) build systems allow execution of arbitrary code as part of the build process (rust, cmake, bazel, JS ecosystem, Go, etc etc) and many involve 3p dependencies. And CI servers often handle secrets to infrastructure (publishing packages, etc). So you could end up allowing a supply chain attack that reads out various API keys & whatnot.
        In other words, properly drawing the boundary around “this is safe with Meltdown mitigations disabled” is very hard and non-intuitive, and you’re one configuration/SW change or one violated assumption away from a Meltdown attack, which is cross-process memory access and one notch below remote access. There’s a reason you design for security in depth rather than trying to carefully build a Jenga tower where you’re one falling block away from total compromise.
    - ActorNightly 7 months ago ago
      >Congratulations, you are probably susceptible to JS code reading crypto keys on your machine.
      No wonder you guys are scared AI is going to take your job lol.
      That's not how it works at all. To grab a key stored in a JS variable, the following would need to happen:
      1. Attacker needs to find a way to inject arbitrary JS code in a website, which means controlling either an iframe that is loaded or some component. This is a pretty hard thing to do these days with Same-Site strictness
      2. The code needs to know specifically what memory address to target. When things like JWT or other tokens are stored in session or local storage, the variable name usually contains a random string. Injected code will have to figure out a way to find what that variable name is.
      3. For attack to work, the cache has to get evicted. This is highly processor specific on how well it works, and also, the web app has to be in a state where no other process is referencing that variable. With JS, you also have to infer memory layout (https://security.googleblog.com/2021/03/a-spectre-proof-of-c...) first, which takes time. Then you have to train the branch predictor, which also takes time.
      So basically, I have a statistically higher chance of losing my keys to someone who physically robs me rather than a cache timing attack.
      Generally when an exploit like this drops, people always have failures to update their systems, and you see it being used in the wild. With Spectre/Meltdown, this didn't really happen, because of the nature of how these attacks work and the difficulty of getting the cache timing code to work correctly without specific targeting of a processor and ability to execute arbitrary code on the machine.
      [-]
      - anyfoo 7 months ago ago
        This seems to theorize an attack where you are interested specifically in data belonging to the website being visited, while also assuming that the attack would have to be carried out on that same website.
        The vulnerability however allows arbitrary reading of any memory in the system in at least some circumstances, the presented PoC (https://www.youtube.com/watch?v=jrsOvaN7PaA ) demonstrates this by literally searching memory for the system's /etc/shadow and dumping that.
        Whether the attack is practical using JS instead of a compiled C program is unknown to me, but if it is, it's not clear to me why the attacker would need to inject JS code into other websites or know what addresses to target. (If it is not, the question is moot.)
        [-]
        ActorNightly 7 months ago ago
        >The vulnerability however allows arbitrary reading of any memory in the system in at least some circumstances, the presented PoC
        The PoC uses compiled C code. I hope I don't have to explain the difference between C code that runs on the system versus JS code that runs in the context of the browser...
        [-]
        anyfoo 7 months ago ago
        Well that only depends on what gadgets happen to be available, doesn't it? Both C and JS get compiled down to machine code.
        I personally would not trust that you couldn't, in the most extreme case, get close enough to the kernel (like the PoC does through system calls) to mispredict it into leaking any kernel-mapped memory through a timing side channel. And nowadays, kernels typically map almost all physical memory in their shared address space (it's not too expensive in a 64 bit address space).
        EDIT: See my extended reasoning here: https://news.ycombinator.com/item?id=43991696
        [-]
        ActorNightly 7 months ago ago
        There is no gadget in JS that lets you access arbitrary memory address in the system. You can create an array and then access it past bounds in the sense that branch predictor will execute this code and in theory load the address to cache, but the memory start of that array is going to be arbitrary. The JS engine doesn't allow you (even in web assembly) to access raw memory by value, and there is always a translation layer.
        [-]
        anyfoo 7 months ago ago
        Again, I don't think I understood yet why the JS code needs to create actual pointers accessing arbitrary memory for the attack to work, instead of benignly passing down arbitrary integer values far enough into (say) the kernel and mispredicting into code that would use these arbitrary values to be dereferenced as pointers, elaborated here: https://news.ycombinator.com/item?id=43991973
        [-]
        ActorNightly 7 months ago ago
        Because you have no way of computing how "far" past the array length you need to access, because you have no idea where the first value is in memory, and you can't read backwards from the memory location assigned to you by the JS engine. So if you get INSANELY lucky and make cache eviction of arbitrary memory addresses work, and you can get around other applications accessing the memory and putting values back in the cache, you are still left with a bunch of random hex values, and you have no idea where the key is, or even if it's in those values (in case the memory of the target process is "behind" chrome)
        With C code, you can pretty much reference any memory location, so you can make things work.
        [-]
        cdman 7 months ago ago
        Spectre was shown to be exploitable from JavaScript: https://www.zdnet.com/article/google-this-spectre-proof-of-c... - betting that the same won't be shown for this one is not a safe wager, I would say :) (especially since JavaScript also includes stuff like WebAssembly).
        7 months ago ago
        [deleted]
    - gblargg 7 months ago ago
      Who these days would trust crypto keys on their machine, given the many hardware wallets available?
      [-]
      - vlovich123 7 months ago ago
        Where do you think the crypto keys for the TLS connection securing your HTTPS browsing are stored? Although from what you said I’m now thinking you’re referring to cryptocurrency & thus aren’t on the same wavelength of the discussion here. Crypto keys —> cryptography keys, not cryptocurrency keys.
      - ActorNightly 7 months ago ago
        I think he means crypto as in like tokens, not wallet keys.
  - anyfoo 7 months ago ago
    > Or people could just understand the scope of the issue better
    Do you understand the scope of the issue? Do you know that this couldn't personally affect you in a dragnet (so, not targeted, but spread out, think opportunistic ransomware) attack?
    Because this statement of yours:
    > Its not like you can write JS code with this that runs on browsers that lets you leak arbitrary secrets.
    was not true for Spectre. The original spectre paper notoriously mentions JS as an attack vector.
    If you truly disable all mitigations (assuming CPU and OS allow you to do so), you will reopen that hole.
    So:
    > The only real way this could ever be used is if you have direct access to the computer where you can run low level code.
    I'm a low level kernel engineer, and I don't know this to be true in the general case. JITs, i.e. the JavaScript ones, also generate "low level code". How do you know of this not being sufficient?
    [-]
    - ActorNightly 7 months ago ago
      >Do you understand the scope of the issue? Do you know that this couldn't personally affect you in a dragnet
      The issue is not whether or not it could affect me, the issue is what is the risk. And I can say for certain that the risk is very low, because I seem to have more understanding of the space.
      >The original spectre paper notoriously mentions JS as an attack vector.
      In an analogy, having an attack vector is having a certain type of weapon, while executing a full exploit end to end is on the scope of waging a war. Sure, a right person at the right place with that weapon can take out a critical target and win the war, but just having that weapon doesn't guarantee you winning a war.
      In the case of certain exploits, like Log4Shell, that's like having a portable shotgun that shoots hypersonic missiles in a scatter pattern. Log4Shell basically means that if anything gets logged, even an error message, that can be used to execute arbitrary code, and it's super easy to check if this is the case - send payloads to all services with a JNDI URL that you control and see what pops up, and boom, you can have shells on those computers.
      In the case of Spectre/Meltdown, it's like having a specific type of booby trap. Whether or not you can actually set up that booby trap highly depends on the environment. If a website is fully protected against code injection, then executing JS cache timing would be impossible. And even if it wasn't, there would be other
      Of course nothing is ever for certain. For example, browsers can contain some crazy logic bug that bypasses Same-Origin checks that nobody has found yet. But the chance of this happening is extremely low, as browser code is public.
      [-]
      - anyfoo 7 months ago ago
        Hmm, I'm not sure why Same-Origin and injection attacks are prerequisite. Shouldn't it be sufficient to visit an arbitrary website through a link somewhere?
        [-]
        ActorNightly 7 months ago ago
        It would be restricted to stealing secrets the website itself places, considering there is no JS variable mapped to data from another website.
        [-]
        anyfoo 7 months ago ago
        This vulnerability is, in the worst case, about reading any memory in the system, not memory confined to any particular website, or to the browser at all, though?
        [-]
        ActorNightly 7 months ago ago
        Not quite. This vulnerability is reading memory that you can directly address.
        If you can run arbitrary machine code on a system, that memory is the entire memory space (in theory) - you can assign a value to any pointer and attempt to read that address through side channel attack.
        In reality the task is much harder - you don't know where in memory the thing you want is because of ASLR, virtual memory maps, and other factors, and to exploit cache timing attacks you need to have cache eviction happen first, and that's not really that straightforward for some memory addresses.
        Javascript that runs in browser on the other hand has a lot more restrictions. You can't dereference a pointer to an arbitrary memory address in JS, you need an existing variable in the current context that is mapped to some memory.
        [-]
        anyfoo 7 months ago ago
        I am really an amateur when it comes to Spectre-like attacks, but do you strictly need a valid pointer pointing to the address? I thought you "just" need to mispredict into code that would use it as a pointer, even if that code is never actually reached?
        The paper demonstrates this by the C PoC using a system call as a gadget. Any value can be passed into the system call before it gets checked for validity on the other side of the kernel boundary. In their example, they use the "buffer" and "buflen" arguments to the keyctl system call, which results in the values passed into the system call being in the registers r12 and r13. Then, they mispredict into a disclosure gadget that uses r12 and r13 for dereferencing pointers:
        movzx edx, byte ptr [r12]
        mov rbx, qword ptr [r13 + rdx*8]
        Note how "buflen" isn't even a pointer (for keyctl) to begin with, but the (as far as I understand) unrelated disclosure gadget code dereferences r13 (because it treats it as a pointer), and they managed to mispredict into it through keyctl's call to the "read" function pointer (this is the part where it's still a bit fuzzy to me, as I unfortunately don't fully understand the misprediction itself and how they control for arbitrary destinations).
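        A toy model of what those two instructions leak, as a sketch only: memory is a Python dict, the "cache" is a set of touched line numbers, and the 64-byte line size and all addresses are my own assumptions, not values from the paper. It models only the data flow (the secret byte selects which line gets loaded), not the misprediction itself.

```python
CACHE_LINE = 64  # bytes per cache line; assumed, typical for x86


def disclosure_gadget(memory, cache, r12, r13):
    """Model the two loads; the second load's address depends on the secret."""
    rdx = memory[r12]                          # movzx edx, byte ptr [r12]
    cache.add((r13 + rdx * 8) // CACHE_LINE)   # mov rbx, qword ptr [r13 + rdx*8]


def probe(cache, r13):
    """Return which line after r13 is cached (the 'reload' half of Flush+Reload)."""
    base = r13 // CACHE_LINE
    lines = 256 * 8 // CACHE_LINE              # 256 byte values, stride 8
    for offset in range(lines):
        if base + offset in cache:
            return offset
    return None


SECRET_ADDR = 0x1000     # hypothetical address holding the secret byte
PROBE_BASE = 0x200000    # hypothetical line-aligned probe buffer
memory = {SECRET_ADDR: 0x53}
cache = set()            # empty set stands in for "everything flushed"

disclosure_gadget(memory, cache, r12=SECRET_ADDR, r13=PROBE_BASE)
leaked_line = probe(cache, PROBE_BASE)
print(leaked_line, memory[SECRET_ADDR] >> 3)   # both values should match
```

        With a stride of 8 and 64-byte lines, eight byte values share a line, so in this model a single probe only pins down the top five bits of the byte.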
        Now, obviously you can't directly make system calls through JS. But I don't understand yet what, if anything, is in place to absolutely make sure that there are no indirect ways that result in a system call (or another path!) where benign, but arbitrary values get passed as arguments in registers, executing benign code, but being mispredicted into a different kernel code path where those registers would be used as pointers.
        And then, once you can do that, you can affect almost arbitrary physical memory, since typically almost all physical memory is mapped in the kernel's address space.
        Sure, this is much harder because of the layers in between, but I still don't quite understand why it's impossible, and why a sufficiently motivated attacker might not eventually find a workable solution?
        Spectre just seems so fundamentally catastrophic for me, that anything but proper hardware fixes to how privilege boundaries are honored by speculative execution seems to merely make things harder to me, but how hard is a very non-trivial question for me. Is it hard enough?
        (As for ASLR, in their paper they break that as their first step using their own methods.)
        [-]
        anyfoo 7 months ago ago
        Reading the paper further, there is this:
        However, the BTB provides partial target addresses [28], so the attacker only needs to branch to an address where the lower portion matches the desired kernel target. The upper bits of the BTB target are provided by the victim branch source, which will be in the kernel address range. The technique follows the one we used in Section 6.1 and Figure 5.
        So it seems to me that the actual difficulty from JS is less passing down the desired memory destinations (that's harder, yes, but I wonder if it's hard enough), but to generate (benign) branches to almost arbitrary addresses within the JS code, as it's probably nigh impossible to control for where those branches go.
        Still, who really knows if there isn't some jump table generator or whatever to allow an attacker to generate branch targets arbitrarily enough (remember that it's not necessary to branch to the full address to train the branch predictor).
        Because this would not be a vulnerability in any sense by itself. It would be yet another completely benign but unlucky piece of code that just allows the tire fire that Spectre is to be leveraged.
        I'm probably missing other relevant aspects.
        As for cache flushing, I think that's what the disclosure gadget does: "The disclosure gadget needs to use the two attacker-controlled registers to leak and transmit the secret via Flush+Reload", so that's also kernel code which we mispredict into. But I'm not totally sure.
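        To make the partial-target point from the quote concrete, here is a minimal sketch of how a predicted target would be assembled if the predictor stores only the low bits. The 32-bit split and both addresses are hypothetical; the quoted passage does not say how many bits the BTB keeps.

```python
def predicted_target(victim_branch_source, trained_target, low_bits=32):
    """Combine the victim's upper address bits with the trained lower bits.

    low_bits is a made-up width for illustration, not a documented value.
    """
    mask = (1 << low_bits) - 1
    upper = victim_branch_source & ~mask   # comes from the kernel branch itself
    lower = trained_target & mask          # comes from the attacker's training
    return upper | lower


kernel_branch = 0xFFFFFFFF81234560   # hypothetical kernel branch address
user_trained = 0x0000000081ABCDE0    # hypothetical user-space training target
print(hex(predicted_target(kernel_branch, user_trained)))
```

        The attacker never branches to a kernel address here; only the low bits of the trained target need to line up with the desired kernel location.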
        [-]
        ActorNightly 7 months ago ago
        You need to flush the addresses out of the cache in order for the branch predictor to speculatively execute and load the address back into the cache. This is where things get very tricky, because let's say you have some other process that accesses that address on a regular basis - it will get reloaded into the cache so your timing attacks have a lower chance of success.
        So overall, putting together an exploit with this through JS becomes a matter of lots and lots of research and testing, for a specific target - i.e. not worth the effort for anyone but a state-sponsored agency.
layer8 7 months ago ago
Intel security advisory: https://www.intel.com/content/www/us/en/security-center/advi...
rtkwe 7 months ago ago
I wonder if there are similar gaps in AMD hardware? Seems like speculative execution is simply an extremely hard to patch vulnerability in a shared processor space, so I wonder how AMD has avoided it.
[-]
- tmoertel 7 months ago ago
  According to the authors' blog post:
  > Does Branch Privilege Injection affect non-Intel CPUs?
  > No. Our analysis has not found any issues on the evaluated AMD and ARM systems.
  Source: https://comsec.ethz.ch/research/microarch/branch-privilege-i...
  [-]
  - __turbobrew__ 7 months ago ago
    Intel is getting kicked while it is down.
- pdpi 7 months ago ago
  The short of it is that AMD haven’t “avoided it”. Speculative execution side channels aren’t one vulnerability but rather a whole family of vulnerabilities. This particular one is (apparently) Intel-only, same as Meltdown was, but AMD was also vulnerable to the original Spectre.
- bee_rider 7 months ago ago
  Pedantically, speculative execution isn’t the vulnerability, it is a necessary mechanism for every high-performance CPU nowadays (where “nowadays” started, like, around the turn of the century). However, bugs and vulnerabilities in speculative execution engines are very widespread because they are complicated.
  There are probably similar bugs in AMD and ARM, I mean how long did these bugs sit undiscovered in Intel, right?
  Unfortunately the only real fix is to recognize that you can’t isolate code running on a modern system, which would be devastating to some really rich companies’ business models.
  [-]
  - fc417fc802 7 months ago ago
    > the only real fix is to recognize that you can’t isolate code running on a modern system
    Does pinning VMs to hardware cores (including any SMT'd multiples) fix this particular instance? My understanding was that doing that addressed many of the modern side channel exploits.
    Of course that's not ideal, but it's not too bad in an era where the core count of high end CPUs continues to creep upwards.
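    For what pinning looks like at the process level, here is a minimal Linux-only sketch using Python's `os.sched_setaffinity`. It is a stand-in for vCPU pinning in a hypervisor, not the same thing, and the helper name `pin_to_cores` is made up. It restricts where a process runs; by itself it does not keep other tenants off those cores.

```python
import os


def pin_to_cores(cores, pid=0):
    """Restrict a process (0 = the current one) to the given CPU numbers."""
    os.sched_setaffinity(pid, set(cores))
    return os.sched_getaffinity(pid)


# Pin the current process to the first CPU it is currently allowed to use.
allowed = sorted(os.sched_getaffinity(0))
print(pin_to_cores(allowed[:1]))
```

    To cover SMT siblings as well, the sibling CPU numbers would need to be added to the set; on Linux they are listed in /sys/devices/system/cpu/cpuN/topology/thread_siblings_list.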
    [-]
    - vlovich123 7 months ago ago
      If this allows reading kernel memory, then your VMs could read the host kernel anyway & any security keys contained therein (& that’s assuming pinning cores limits the exploit to memory being accessed by other CPUs on the same core which generally has not been true of side channel attacks as far as I’m aware).
    - everfrustrated 7 months ago ago
      Possibly not. Seems like this exploit allows walking memory which would be shared?
      [-]
      - fc417fc802 7 months ago ago
        But can you exfiltrate memory without it being accessed on the core you're running on? I thought branch predictors were a one-per-physical-core sort of thing and that this class of side channel attack leaked something being done on the same core in a different privilege domain.
  - rtkwe 7 months ago ago
    I meant that it's a feature that's hard to implement in a way that delivers the performance gains without creating vulnerabilities like this one.
- quotemstr 7 months ago ago
  The solution to this particular vulnerability is intuitive to me: snapshot the current privilege level when we enqueue a branch predictor update and carry that snapshot along with the update itself as it flows through the processor's internal buffers. Same problem you might have in software and the same solution, yes?
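  In software terms, the idea would look roughly like the toy model below. It is only a sketch of the bookkeeping: the class, the queue, and the addresses are invented for illustration, and it says nothing about whether real hardware can afford to carry the extra state.

```python
from collections import deque


class ToyPredictor:
    """Branch predictor whose updates are applied later, out of a queue."""

    def __init__(self, snapshot_privilege):
        self.snapshot_privilege = snapshot_privilege
        self.current_privilege = "user"
        self.pending = deque()
        self.entries = {}  # branch address -> (target, privilege tag)

    def enqueue_update(self, branch, target):
        # Record the privilege level at the time the branch was seen.
        self.pending.append((branch, target, self.current_privilege))

    def switch_privilege(self, level):
        self.current_privilege = level

    def drain(self):
        # Apply queued updates; the tag is either the snapshot or "now".
        while self.pending:
            branch, target, seen_at = self.pending.popleft()
            tag = seen_at if self.snapshot_privilege else self.current_privilege
            self.entries[branch] = (target, tag)


def run(snapshot_privilege):
    predictor = ToyPredictor(snapshot_privilege)
    predictor.enqueue_update(branch=0x400, target=0xBAD)  # trained in user mode
    predictor.switch_privilege("kernel")                  # syscall happens first
    predictor.drain()                                     # update lands late
    return predictor.entries[0x400][1]


print(run(snapshot_privilege=False), run(snapshot_privilege=True))
```

  Without the snapshot, the user-mode update ends up tagged as kernel because the tag is taken when the update lands; with the snapshot it keeps the user tag.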
  [-]
  - wbl 7 months ago ago
    That actually doesn't work. The evaluation of the branch condition may be at some point far away from where the privilege update is recognized and executed. There is no current state to update, it's only recognized in retrospect what the state was. And carrying along data is pricey in a CPU: the instruction pointer isn't even available because of this.
    You could say we only update the predictor at retirement to solve this. But that can get a little dicy also: the retirement queue would have to track this locally and retirement frees up registers, better be sure it's not the one your jump needs to read. Doable but slightly harder than you might think.
smartmic 7 months ago ago
> Closing these sorts of gaps requires a special update to the processor’s microcode. This can be done via a BIOS or operating system update and should therefore be installed on our PCs in one of the latest cumulative updates from Windows.
Why mention only Windows, what about Linux users?
[-]
- ajross 7 months ago ago
  Intel distributes microcode updates for Linux here: https://github.com/intel/Intel-Linux-Processor-Microcode-Dat... , and the distros are all set up to pull from there and distribute automatically.
  Not expert enough to know what to look for to see if these particular mitigations are present yet.
  [-]
  - yencabulator 7 months ago ago
    Researchers say this is CVE-2024-45332.
    INTEL-SA-01247 covers that CVE.
    Microcode release 20250512 has that INTEL-SA mitigated.
    https://github.com/intel/Intel-Linux-Processor-Microcode-Dat...
    https://www.intel.com/content/www/us/en/security-center/advi...
  - brokenmachine 7 months ago ago
    On an Ubuntu 24.04.2 machine:
```
   dpkg -l | grep microcode

   ii  amd64-microcode 3.20231019.1ubuntu2.1 amd64 Processor microcode firmware for AMD CPUs
   ii  intel-microcode 3.20250211.0ubuntu0.24.04.1 amd64 Processor microcode firmware for Intel CPUs
   ii  iucode-tool 2.3.1-3build1 amd64 Intel processor microcode tool
```
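    A quick way to compare that package version against the 20250512 microcode release mentioned elsewhere in this thread, as a sketch. It assumes the package version embeds the upstream release date as an 8-digit YYYYMMDD field (which is how the version above looks); a distro could also backport fixes without changing that date, so treat the result as a hint only.

```python
import re

FIXED_RELEASE = 20250512  # microcode release cited in this thread for INTEL-SA-01247


def release_date(package_version):
    """Extract the first standalone 8-digit date from a package version string."""
    match = re.search(r"(?<!\d)(\d{8})(?!\d)", package_version)
    if match is None:
        raise ValueError(f"no YYYYMMDD field in {package_version!r}")
    return int(match.group(1))


def includes_fix(package_version):
    """True if the embedded release date is at or after the fixed release."""
    return release_date(package_version) >= FIXED_RELEASE


print(includes_fix("3.20250211.0ubuntu0.24.04.1"))
```

    For the intel-microcode version listed above this prints False, since 20250211 predates 20250512.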
- matja 7 months ago ago
  The Linux kernel has had microcode loading support (`CONFIG_MICROCODE` / `CONFIG_MICROCODE_INTEL`) for many years, but it does require that Intel release the microcode files necessary for distribution maintainers to update the packages; after that, it should be included in a system update.
margorczynski 7 months ago ago
I wonder if there's any way to recover for Intel. They don't have anything worthwhile on the market, R&D takes a lot of time and their foundries are a constant source of losses as they're inferior compared to the competition.
On top of that x86 seems to be pushed out more and more by ARM hardware and now increasingly RISC-V from China. But of course there's the US chip angle - will the US, especially after the problems during Covid, let a key manufacturer like Intel bite the dust?
[-]
- chneu 7 months ago ago
  Intel really isn't in as much trouble as tech blogs like to act.
  It's not great but lol the sensationalism is hilarious.
  Remember, gamers only make up a few percent of users for what Intel makes. But that's what you hear about the most. One or two data center orders are larger than all the gaming CPUs Intel will sell in a year. And Intel is still doing fine in the data center market.
  Add in that Intel still dominates the business laptop market which is, again, larger than the gamer market by a pretty wide margin.
  [-]
  - WaxProlix 7 months ago ago
    You're right about gamers, but other verticals are looking bad for Intel, too.
    The two areas you mention (data center, integrated OEM/mobile) are the two that are most supply chain and business-lead dependent. They center around reliable deliveries of capable products at scale, hardware certifications, IT department training, and organizational bureaucracy that Intel has had captured for a long time.
    But!
    Data center specifically is getting hit hard from AMD in the x86 world and ARM on the other side. AWS's move to Graviton alone represents a massive dip in Intel market share, and it's not the only game in town.
    Apple is continuing to succeed in the professional workspace, and AMD's share of laptop and OEM contracts just keeps going up. Once an IT department or their chosen vendor has retooled to support non-Intel, that toothpaste is not going back into the tube - not fully, at least.
    For both of these, AMD's improvement in reliability and delivery at scale will be bearing fruit for the next decade (at Intel's expense), and the mindshare, which gamers and tech sensationalism are indicators for, has already shifted the market away from an Intel-dominated world to a much more competitive one. Intel will have to truly compete in that market. Intel has stayed competitive in a price-to-performance sense by undermining their own bottom line, but that lever only has so far it can be pulled.
    So I'm not super bullish on Intel, sensationalism aside. They have a ton of momentum, but will need to make use of it ASAP, and they haven't shown an ability to do that so far.
- layer8 7 months ago ago
  Intel still has well over 70% x86 market share. They have a long runway. Arm had only 15% datacenter market share last year, and still hasn’t made much headway in the Windows market.
  [-]
  - drob518 7 months ago ago
    The trend toward ARM in both laptops and data centers is clear. It’s being driven by power efficiency as much as performance. The x86 guys have shown that they can make x86 fast and that CISC is really not an issue, but that takes a lot of transistors and those transistors inevitably burn power. For the same performance, x86 will always be more power hungry. And so the industry will keep moving toward ARM and RISC-V.
  - freeone3000 7 months ago ago
    Arm is making huge gains though — five years ago they had less than 5%. The future of x86 is not bright.
    [-]
    - baq 7 months ago ago
      x86 vs arm doesn’t matter. Hardware matters. Intel needs to make the best cpu again. It can be x86, it can be arm, it can be risc-v.
      [-]
      - adgjlsfhk1 7 months ago ago
        Arm vs x86 matters a lot for Intel since they don't make Arm CPUs. x86 used to be a massive moat for Intel/AMD. The rise of ARM market share means that that moat is draining. 10 years ago, AMD and IBM were the only competition (and they were both in rough shape). Now Intel is competing against AMD, NVidia, Qualcomm, Amazon, and Arm. Even if Intel can make the best CPU again, they no longer can charge monopoly prices for it. If you have a 10% faster CPU, that only lets you charge a small premium over everyone else.
        [-]
        genewitch 7 months ago ago
        There's an Arm cpu in every Intel CPU in ring 0 or -1, it boots a modified MINIX. So Intel knows a bit about ARM.
        [-]
        yencabulator 7 months ago ago
        Wikipedia says the Intel ME has been an x86 since Skylake (2015), and before that it was ARC not ARM. AMD is the one using an ARM core for that functionality.
        https://en.wikipedia.org/wiki/Intel_Management_Engine
        https://en.wikipedia.org/wiki/AMD_Platform_Security_Processo...
        [-]
        genewitch 7 months ago ago
        i don't think the manufacturers would share what arch their deeply embedded cores are. Christopher Domas was the first person to interact with what he calls ring -4 to escalate from ring 3 to ring 0. the processors were old, and the ring -4 is not x86. I'm looking at a slide from 2019 that says that IME is a physically separate, non-x86 processor that boots minix
        Now, i may be misremembering and i don't have time today to download all his talks and grep the .vtt for "ARM"; however, my memory is reinforced by literally 30 seconds of internet searches. i bought the Minix book because of one of the presentations.
        i'm not doing any more research for free on this. Even if it isn't ARM, it isn't x86.
        [-]
        yencabulator 7 months ago ago
        Uh-huh. Meanwhile, people reverse engineering the firmware are repeatedly saying it's x86. For example, https://puri.sm/posts/deep-dive-into-intel-me-disablement/
        Are you perhaps looking at some slides from Cyber@UC Meeting 81, held Jan 16 2019, located at https://www.cyberatuc.org/files/slides/meeting_081.pdf which clearly say
        > Physically separate processor embedded within the x86 processor that runs a custom MINIX image
        and misreading that as saying more than what it does?
        And those slides link to more resources saying it's an x86.
        [-]
        genewitch 7 months ago ago
        fine, you win. every bit of the CPU in an intel CPU is 100% x86. nevermind that the thing i am talking about is more "deeply embedded" than the management engine, has access to all registers, etc. oh and is specified to be RISC. I guess technically x86 is RISC, so...
        you win.
        [-]
        yencabulator 7 months ago ago
        Also running MINIX, and being physically separate? That's what you tried to quote as your proof.
        And somehow this processor would also not be on the target list for Coreboot/Libreboot/Purism/Google people trying to de-ME their hardware?
        Mr. Occam says I have very little reason to trust your recall/judgement, at this time.
        [-]
        genewitch 7 months ago ago
        preface: you brought up IME which isn't what i was talking about. that's ring -3. The thing i am talking about is either adjacent or "above" that in the hierarchy. I was not, and never spoke of the ME. i quoted the "physically separate" part for a reason, although if prodded, i couldn't have told you at the time. it isn't on the CPU die. anyhow:
        It's funny that i knew about the minix even though according to your sources that wasn't what was running on the x86 chips until after they removed the RISC embedded cpu and switched to "x86." i've looked at your wiki link and followed the footnote, to an archive.org page where it is merely claimed that it is "now x86" and "running minix 3".
        So we're at an impasse. I'm not downloading a bunch of youtube .vtt files and you've linked as authoritative sources as i have at this point; "someone said so."
        that is: wiki cites the ptsecurity blogpost from august 2017 as the source for the claim that it is x86. furthermore, the blogpost claims that the architecture is "lakemont" which is 32nm, but the blog claims it's 22nm. Further, it claims it's specifically the quark, which was discontinued in 2019. i understand they can use the IP in the toolchain to put that on the main die, as well as build that part of the die at a larger size. However, there are a few other assertions that appear in there (in the code listings) that appear nowhere else on the internet.
        oh, and ask mister occam if a physically separate chip (the Intel PCH 100 and up) counts as "embedded in the intel CPU" which is what i've been saying (ring -1, ring -4 are all on the physical die of the CPU.)
        since we like wiki so much https://en.wikipedia.org/wiki/Platform_Controller_Hub that's where the ME is, per your link, first paragraph: It is located in the Platform Controller Hub of modern Intel motherboards.
        I knew this was a waste of time, and now i spent an hour digging through crap that makes my eyes bleed like https://www.intel.com/content/www/us/en/content-details/3326...
        you're talking about a completely separate chip, and that was a red herring. I'm pretty annoyed at myself right now.
        tremon 7 months ago ago
        More than a bit, Intel actually produced ARM processors for a decade. The XScale line of processors was sold to Marvell in 2006, and that knowledge has probably atrophied since then. Intel used to build their network interface cards around an XScale core, not sure what they're using now.
- porridgeraisin 7 months ago ago
  I guess it depends on your expectations. Will they be fine as a company? I think yes. Will they be as prominent as they were at different points in their history? I think not.
  Product aside, from a shareholder/business point of view (I like to think of this separately these days as financial performance is becoming less and less reflective of the end product) I think they are too big to fail.
- emkoemko 7 months ago ago
  didn't i read something about apple,nvidia and other companies looking to use their foundries? why would they do that if its inferior or was that something else?
  [-]
  - greenavocado 7 months ago ago
    Because there's nothing else in America
yonatan8070 7 months ago ago
Just to make sure I got this right, at this point in time there are patches out for all major operating systems that can mitigate this/apply relevant microcode to mitigate it?
[-]
- dboreham 7 months ago ago
  Yes. Embargo date May 13 (today).
HeliumHydride 7 months ago ago
https://scholar.harvard.edu/files/mickens/files/theslowwinte...
"Unfortunately for John, the branches made a pact with Satan and quantum mechanics [...] In exchange for their last remaining bits of entropy, the branches cast evil spells on future generations of processors. Those evil spells had names like “scaling-induced voltage leaks” and “increasing levels of waste heat” [...] the branches, those vanquished foes from long ago, would have the last laugh."
[-]
- Hackbraten 7 months ago ago
  I love James Mickens!
  https://www.usenix.org/system/files/1401_08-12_mickens.pdf
  > The Mossad is not intimidated by the fact that you employ https://. If the Mossad wants your data, they’re going to use a drone to replace your cellphone with a piece of uranium that’s shaped like a cellphone, and when you die of tumors filled with tumors, […] they’re going to buy all of your stuff at your estate sale so that they can directly look at the photos of your vacation instead of reading your insipid emails about them.
  [-]
  - wood_spirit 7 months ago ago
    So this is where they got the pager and walkie talkie ideas from
    [-]
    - genewitch 7 months ago ago
      You know, I didn't really think of this till your comment: this was a vast conspiracy spanning years. You always hear, when discussions of "conspiracies" happen, things like "it would involve too many people, too many moving parts" like opsec would be impossible. And then you have the pagers.
      Kinda like the old chestnut that rich people are only rich on paper and then, Musk buys twitter. Not tesla, or some DBA, Musk.
      This decade might actually be the season of reveal.
      [-]
      - angra_mainyu 7 months ago ago
        I think this is a human thing, being very myopic given our short lifespans, we forget even relatively recent things.
        The Cold War for example was full of these intricate, complex and stunning feats of spycraft that they'd pull off on each other.
- btown 7 months ago ago
  This is absolute gold!
  > “Making processors faster is increasingly difficult,” John thought, “but maybe people won’t notice if I give them more processors.” This, of course, was a variant of the notorious Zubotov Gambit, named after the Soviet-era car manufacturer who abandoned its attempts to make its cars not explode, and instead offered customers two Zubotovs for the price of one, under the assumption that having two occasionally combustible items will distract you from the fact that both items are still occasionally combustible.
  > Formerly the life of the party, John now resembled the scraggly, one-eyed wizard in a fantasy novel who constantly warns the protagonist about the variety of things that can lead to monocular bescragglement.
  And in 2013 the below would have been correct, but we live in a very different world now:
  > John’s massive parallelism strategy assumed that lay people use their computers to simulate hurricanes, decode monkey genomes, and otherwise multiply vast, unfathomably dimensioned matrices in a desperate attempt to unlock eigenvectors whose desolate grandeur could only be imagined by Edgar Allen Poe. Of course, lay people do not actually spend their time trying to invert massive hash values while rendering nine copies of the Avatar planet in 1080p.
  He wasn't too far off about the monkeys, though...
- bee_rider 7 months ago ago
  The bit about vast matrices shows some silver lining though; it turns out John’s little brother figured out how to teach those matrices to talk like a person.
  [-]
  - yvdriess 7 months ago ago
    Yes but those transistors moved to greener pastures.
201984 7 months ago ago
```
  mitigations=off
```
Don't care.
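For anyone who wants to see what that boot flag leaves exposed, Linux reports per-vulnerability status as small text files under /sys/devices/system/cpu/vulnerabilities. A minimal sketch that reads them (the function name is made up, and the set of files depends on kernel version and CPU):

```python
from pathlib import Path

SYSFS_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigation_status(base=SYSFS_DIR):
    """Map each vulnerability name to the kernel's one-line status string."""
    if not base.is_dir():
        return {}  # not Linux, or sysfs not mounted
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(base.iterdir())
        if entry.is_file()
    }


for name, status in mitigation_status().items():
    print(f"{name}: {status}")
```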
[-]
- matja 7 months ago ago
  "Don't mind me running this piece of WASM in a webworker to collect all the useful encryption keys and cookies in your RAM..."
  [-]
  - 201984 7 months ago ago
    Has even a single web exploit ever been found in the wild? Until then, I'm not going to worry and probably not even then.
    [-]
    - dzaima 7 months ago ago
      As long as most people run with mitigations on, you're technically probably indeed safe. But you should still care that things get fixed with mitigations=on, otherwise you wouldn't have the shield of "almost everyone has mitigations enabled for this so no one has reason to bother exploiting this"!
    - autoexec 7 months ago ago
      Yes, at least in the Spectre/Meltdown days. https://www.techtarget.com/searchsecurity/news/252434342/Mel...
      [-]
      - teruakohatu 7 months ago ago
        Haven’t browsers closed that hole independently by reducing timing precision?
  - johnnyjeans 7 months ago ago
    Uncaught ReferenceError: WebAssembly is not defined
    [-]
    - vlovich123 7 months ago ago
      You don't need WASM to deploy Spectre/Meltdown. Vanilla JS works just fine which is what was demonstrated in the original paper.
      [-]
      - 7 months ago ago
        [deleted]
      - brobinson 7 months ago ago
        Didn't all the major browsers alter their timing APIs to make this impossible/difficult?
        [-]
        anyfoo 7 months ago ago
        I'm not an expert, but I think you can only make this harder by intentionally making timers less precise (even adding some random fuzz). Someone may correct me if I'm wrong, but I think statistically, a less precise timer means you will just need a longer runtime.
        Suppose you want to measure the distribution of the delay between recurring events (which is basically what's at the heart of those vulnerabilities). Suppose the delays are all sub-milliseconds, and that your timer, to pick something ridiculous, only has a 2 second granularity.
        You may at first think that you cannot measure the sub-millisecond distribution with such a coarse timer. But consider that events and timers are not synchronized to each other, so with enough patience, you will still catch some events barely on the left or on the right side of your 2 second timer tick. Do this over a long enough time, and you can reconstruct the original distribution. Even adding some randomness to the timer tick just means you need more samples to suss the statistic out.
        Again, I am not an expert, and I don't know if this actually works, but that's what I came up with intuitively, and it matches with what I heard from some trustworthy people on the subject, namely that non-precision timers are not a panacea.
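The intuition in the comment above can be sketched as a small simulation. This is only an idealized model, not a working attack: the function names (`coarse_clock`, `estimate_delay`) and all parameters are made up for illustration, the event is assumed to start at a uniformly random phase relative to the timer ticks, and timer jitter is not modeled. Under those assumptions the coarse timer reports either zero or one full tick per measurement, and the average over many measurements converges to the true delay.

```python
import math
import random


def coarse_clock(t, granularity):
    """Return time t as seen through a timer that only advances every `granularity` seconds."""
    return math.floor(t / granularity) * granularity


def estimate_delay(true_delay, granularity, samples, seed=0):
    """Estimate a short delay using only a coarse timer.

    Each sample starts at a random phase within one timer tick (event and
    timer are assumed to be unsynchronized). A single coarse reading is
    almost always 0 and occasionally one full tick; the fraction of samples
    that straddle a tick is true_delay / granularity.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        start = rng.random() * granularity  # random phase within one tick
        end = start + true_delay
        total += coarse_clock(end, granularity) - coarse_clock(start, granularity)
    return total / samples  # mean of the coarse readings approximates true_delay


if __name__ == "__main__":
    # 0.5 ms event measured with the "ridiculous" 2 s timer from the comment.
    estimate = estimate_delay(true_delay=0.0005, granularity=2.0, samples=2_000_000)
    print(f"estimated delay: {estimate * 1000:.3f} ms")
```

The number of samples needed grows with the ratio of timer granularity to delay, which is consistent with the point that a coarser timer costs the measurer runtime rather than making the measurement impossible.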
        [-]
        teruakohatu 7 months ago ago
        > Even adding some randomness to the timer tick just means you need more samples to suss the statistic out.
        If each timer draws from the same random distribution then sure, you could work out the real tick with greater accuracy, but I don’t know if that is practical.
        If the timers draw from different distributions then it is going to be much harder.
        I imagine there is an upper limit of how much processing can be done per tick before any attack becomes implausible.
        [-]
        anyfoo 7 months ago ago
        > If the timers draw from different distributions then it is going to be much harder.
        Again, I'm an amateur, but I think you just need to know that distribution, which I guess you usually do (open source vs. closed source barely matters there), law of large numbers and all.
        Anyway, looking through literature, this article presents some actual ways to circumvent timers being made coarse-grained: https://attacking.systems/web/files/timers.pdf
        In that article, the "Clock interpolation" sounds vaguely related to what I was describing on a quick read, or maybe it's something else entirely... Later, the article mentions alternative timing sources altogether.
        Either way, the conclusion of the article is that the mitigation approach as a whole is indeed ineffective: "[...] browser vendors decided to reduce the timer resolution. In this article, we showed that this attempt to close these vulnerabilities was merely a quick-fix and did not address the underlying issue. [...]"
        [-]
        vlovich123 7 months ago ago
        I believe your understanding of the literature is correct (I too am an amateur when it comes to side channel attacks). My memory is vague here but I believe that while it still lets you exploit side channels, it still requires extra time to do so which lowers the throughput you get out of the gadget.
        vlovich123 7 months ago ago
        They are not a panacea (in some cases - the way Cloudflare Workers does them it does more effectively limit attacks vs how browsers have to work) but slowing down an attack is valuable because it can make the attack infeasible because your ability to retrieve anything damaging is bounded by how long you visit that website.
        [-]
        anyfoo 7 months ago ago
        Fair. Better than nothing.
        vlovich123 7 months ago ago
        They temporarily disabled high resolution timing APIs until they rearchitected how JS got executed in the wake of spectre/meltdown. They sandboxed JS into separate processes by domain (site-isolation) & created the concept of cross-origin isolation. The combination of the two lets you gain back sub-millisecond timers and SharedArrayBuffer which are two gadgets that were particularly useful for the Spectre paper.
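As a rough illustration of the cross-origin isolation opt-in mentioned above: a page is treated as cross-origin isolated when it is served with the `Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp` response headers. The sketch below is a minimal local static server that adds them; the class name `IsolatedHandler`, the `serve` helper, and the port are illustrative choices, not anything from the thread.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Response headers that opt a page into cross-origin isolation.
ISOLATION_HEADERS = {
    "Cross-Origin-Opener-Policy": "same-origin",
    "Cross-Origin-Embedder-Policy": "require-corp",
}


class IsolatedHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds the isolation headers to every response."""

    def end_headers(self):
        for name, value in ISOLATION_HEADERS.items():
            self.send_header(name, value)
        super().end_headers()


def serve(port=8000):
    """Serve the current directory on localhost (the port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), IsolatedHandler).serve_forever()
```

Calling `serve()` and evaluating `crossOriginIsolated` in the browser console for a page loaded from that server should show whether the browser granted isolation, which is the condition under which `SharedArrayBuffer` becomes available again.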
  - anthk 7 months ago ago
    UBlock Origin with JS turned off, or NoScript. Good luck.
  - bee_rider 7 months ago ago
    Yeah, he should really turn mitigations on, so that when running arbitrary code from the internet he can be subject to 9999 vulnerabilities, instead of 10,000.
    [-]
    - darkmighty 7 months ago ago
      There are many kinds of vulnerabilities. Most are pretty mundane afaict. Breaking sandboxes and reading out your entire RAM is basically game over, existential vulnerability (second only to arbitrary code execution, though it can give you SSH keys I guess).
      The mitigating factor is actually that you don't go to malicious websites all the time, hopefully. But it happens, including with injected code on ads and stuff that may be enabled by secondary vulnerabilities.
    - anyfoo 7 months ago ago
      I challenge you to name another readily available "read arbitrary RAM from userspace"[1] vulnerability.
      [1] Not even including "potentially exploitable from JavaScript", which Spectre was. It's sufficient if you name one where an ordinary userspace program can do it.
      [-]
      - genewitch 7 months ago ago
        Can't you trivially do this with 4 lines of C?
        [-]
        Akronymus 7 months ago ago
        Only if you already have the ability to read arbitrary RAM. So running in kernel/hypervisor mode.
        The exploit is being able to do it from usermode through an api (browser/js) that normally forbids that.
        Userspace can only access its own memory, rather than the whole system's.
        anyfoo 7 months ago ago
        Userspace processes can only read their own memory, or what has been shared with them.
        [-]
        genewitch 7 months ago ago
        so how do programs like Cheat Engine and WeMod work, on windows? they don't request an administrator password, and i can tamper with any processes' memory i've tried, including firefox.exe and the like.
        https://cheatengine.org/
        https://www.wemod.com/
tannhaeuser 7 months ago ago
> All intel processors since the 9th generation (Coffee Lake Refresh) are affected by Branch Privilege Injection. However, we have observed predictions bypassing the Indirect Branch Prediction Barrier (IBPB) on processors as far back as 7th generation (Kaby Lake).
From that piece of text on the blog, I don't quite understand if Kaby Lake CPUs are affected or not.
[-]
- fwip 7 months ago ago
  At least some Kaby Lake CPUs are affected, but they can't say for sure that all of them are.
  [-]
  - lostmsu 7 months ago ago
    No, I think they are saying that they can only demonstrate the exploit on Coffee Lake Refresh and later, but the issue that let them create the exploit exists all the way back to Kaby Lake. So those CPUs are probably exploitable too, but this specific exploit does not target them.
- chrisweekly 7 months ago ago
  I interpret it as including Kaby Lake.
  [-]
  - autoexec 7 months ago ago
    which would mean that all our intel machines have been vulnerable and defenseless for the last 9 years.
Alcatros552 7 months ago ago
It seems a lot of people are not aware that this is a newer generation of branch predictor issue. Intel's eIBRS doesn't mitigate the problem, which leaves these CPUs susceptible to attack. To prevent bigger issues, the vulnerability was disclosed only after Intel had been informed of it, and most systems have been patched in the meantime.
dzdt 7 months ago ago
The end-user processor slowdowns from Spectre and Meltdown mitigations were fairly substantial. Has anyone seen an estimate of how much the microcode updates for this new speculative vulnerability are going to cost in terms of slowdown?
[-]
- leonidasv 7 months ago ago
  > Our performance evaluation shows up to 2.7% overhead for the microcode mitigation on Alder Lake. We have also evaluated several potential alternative mitigation strategies in software with overheads between 1.6% (Coffee Lake Refresh) and 8.3% (Rocket lake)
  https://comsec.ethz.ch/research/microarch/branch-privilege-i...
  [-]
  - dzdt 7 months ago ago
    Thanks, missed that! I remember seeing benchmarks showing like 15% slowdown from Spectre/Meltdown mitigations, so this is not as bad as that, but that is on top of the other too I guess...
j45 7 months ago ago
Since the cloud is someone else's computer, and someone else's shared CPU, is cloud hosting (including vps) potentially impacted?
Look forward to learning how this can be meaningfully mitigated.
[-]
- andrewla 7 months ago ago
  Intel claims [1] that they already have microcode mitigation. Like Spectre and Meltdown this is likely to have performance implications.
  [1] https://www.intel.com/content/www/us/en/security-center/advi...
  [-]
  - j45 7 months ago ago
    Spectre and Meltdown had some pretty big performance hits in the beginning. Wonder how much it will differ here in real world, third party (and independent) testing.
- matja 7 months ago ago
  For reads across different VMs on the same CPU, theoretically TME-MK could mitigate the usefulness of the memory reads by having each VM access memory using a different memory encryption key, but I don't know of any hypervisors that implement this.
  AMD has had SEV support in QEMU for a long time, which some cloud hosting providers use already, that would mitigate any such issue if it occurred on AMD EPYC processors.
  [-]
  - wahern 7 months ago ago
    Memory encryption doesn't usually protect against these kinds of side channels. You're not reading memory directly, but inferring it based on discernible behavior of the privileged code. Inasmuch as SEV and TME-MK are marketed as protecting VM guest memory from host machine snooping, they've proven insufficient many times before.[1] In the end, you have to trust your VM hosting provider, and trust that they've written their hypervisors in a robust way that takes into account these unforeseen (yet predictable) issues when isolating guests from each other.
    [1] See, e.g., https://www.amd.com/en/resources/product-security/bulletin/a... and https://www.intel.com/content/www/us/en/developer/articles/t...
  - anonymousDan 7 months ago ago
    Memory encryption typically doesn't keep anything encrypted within the CPU (e.g. in caches). Haven't looked at the details of this bug but I expect it wouldn't help for that reason.
  - j45 7 months ago ago
    Appreciate the insight on the AMD side.
    Their new processors are quite inviting, but like with all CPU’s I’d prefer to keep the entire thing to myself.
The28thDuck 7 months ago ago
Haven’t we been here before? It seems like it’s very similar to the branch prediction exploits of the late 2010s. Is there something particularly novel about this class of exploits?
[-]
- mettamage 7 months ago ago
  Probably, I haven't had time to delve into the article yet. But ever since I first learned about them I got the hunch that they'd never fully go away.
  Then people say "no that's not possible, we got security in place."
  So then the researchers showcase a new demo where they use their existing knowledge with the same issue (i.e. scaling-induced voltage leaks).
  I suspect this will go on and on for decades to come.
- dboreham 7 months ago ago
  It's the same exploit, just exploiting a bug in the mitigation for the first exploit.
  [-]
  - LkpPo 7 months ago ago
    Predictable suite.^_^ Disable the feature in BIOS if possible. Branch prediction is a rotten plank that needs to be scrapped. It's a remarkably bad idea that doesn't deserve to be saved.
x3al 7 months ago ago
Is there an easy way to run a browser on some phone/tablet and everything else on a desktop just to isolate the web and JS from accessing your desktop?
pawanjswal 7 months ago ago
Just when we thought Spectre was fading, it pulls a full sequel—Intel CPUs still keeping things spicy!
arkh 7 months ago ago
So is it time for some cryptography coprocessor / cards?
gitroom 7 months ago ago
yeah this just makes me wanna see real world numbers on the slowdown, cuz honestly all these microcode fixes feel like trading off years of speed for maybe a little more peace of mind - you ever think we'll actually move off this cycle or is it just here to stay?
[-]
- myself248 7 months ago ago
  I'd love to see a graph showing mitigations=off vs mitigations=on speed gains over the years. It feels like most of the new tricks have had vulnerabilities of some sort, which eat up some or most of their gains.
  Of course at launch, the vulns haven't been found yet, so the performance uplift claims are all fluff based on the reckless actions of an unprotected trick. By the time the skeletons have come out of the closet, the release hype cycle is far in the past, and only a few nerd blogs will cover the corresponding performance hit.
  Lather, rinse, repeat for the next launch. New tricks, new reckless performance gains, big uplift numbers, hope it takes a little while before anybody points out the holes in the emperor's clothes. But at the end of the day, the actual performance progress over time is, I suspect, dramatically slower than what the naïve summation of individual launch-day claims would suggest.
  To your second point:
  I think it's here to stay as long as we allow someone else's code (via js, webassembly, and their ilk) to run on our processors as a matter of course.
  A rethink of the entire modern web, back to a simpler markup language that isn't Turing-complete and can't leak data back to the attacker's server, would be needed before we could turn the mitigations off and enjoy the performance we were promised.
  But of course, if we get rid of the modern web as we know it, we probably don't need all that performance anyway. An old BlueWave/QWK mailer consumes 10,000x less resources than a Gmail tab, you know?
whatever1 7 months ago ago
It’s dead, can you please stop stabbing it?
[-]
- anonymars 7 months ago ago
  I thought I understood these words, yet I don't understand what you mean
arghwhat 7 months ago ago
> On an up to date Ubuntu 24.04
So not very up to date, but I suppose mitigations haven't changed significantly upstream since then.
[-]
- necubi 7 months ago ago
  24.04 is the most recent LTS (long term support) release; it's what users are meant to be running for anything important
  [-]
  - arghwhat 7 months ago ago
    My point is that it is not representative of the current state of the kernel.
    The kernel has nothing to do with Ubuntu, its release schedule and LTS's. Distro LTS releases also often mean custom kernels, backports, hardware enablement, whatnot, which makes it a fork, so unless we're analyzing Ubuntu security rather than Linux security, mainline should be used.
    [-]
    - BoredPositron 7 months ago ago
      Microcode updates have nothing to do with the kernel?
      [-]
      - arghwhat 7 months ago ago
        Mitigations are generally kernel workarounds, not microcode updates. Microcode updates provide some fixes or disable features, but the kernel-side effort, and the reason they specify the software at all, comes down to kernel mitigations.
      - dboreham 7 months ago ago
        Correct.
- thomasdziedzic 7 months ago ago
  That version is significant because it is the latest LTS release. Most servers use LTS releases.
- 7bit 7 months ago ago
  There is a difference between an up2date Ubuntu 24.04 and an up2date Ubuntu.
  And as security updates are back ported to all supported versions - and 24.04 being an LTS release, it is as up2date as it gets.
  If you're being pedantic, be the right kind of pedantic ;)
  [-]
  - arghwhat 7 months ago ago
    The problem is that it's downstream backports and hardware enablement - you're running an old forked artisanal kernel maintained by Canonical, you will only get bugfixes if known to be severe enough to be flagged, and all this patching deviates it from mainline and can itself introduce new security vulnerabilities not present in mainline.
    This differs from an actual later release which is closer to mainline and includes all newer fixes, including ones that are important but weren't flagged, and with less risk of having new downstream bugs.
    If you're going to fight pedantry by being pedantic, better be the right kind of pedantic. ;)
- blueflow 7 months ago ago
  Ubuntu 24.04 is the current LTS release. Or are you intending to say that Ubuntu, regardless of version, is not up to date?
  Edit: "LTS" added due to popular demand
  [-]
  - pdpi 7 months ago ago
    You need a qualifier there — the latest Ubuntu release is 25.04, but 24.04 is the current LTS release.
    [-]
    - razemio 7 months ago ago
      It is up to date, with security patches and fixes. That is obviously what is relevant here. That is why the parent comment got down voted, since it is up to date in context of a security vulnerability. It should be even more secure, since new software versions might introduce unknown attack vectors.
  - arghwhat 7 months ago ago
    I am saying that any version of Ubuntu is not representative of the mainline kernel, which is what is relevant when it comes to analyzing current mitigations.
    Distro LTS releases often mean custom kernels, backports, hardware enablement, whatnot, which makes it effectively a fork.
    Unless we're interested in discovering kernel variation discrepancies, it's more interesting to analyze mainline.
    [-]
    - fc417fc802 7 months ago ago
      I'd expect an awful lot of production workloads to be running on LTS kernel versions (and likely also LTS distro releases). So the mitigations currently available in an LTS release of a mainstream distro are quite relevant.
      [-]
      - arghwhat 7 months ago ago
        They are running their LTS distro's LTS kernel, but that is not an upstream thing.
        On an LTS, you'll be running a Canonical kernel, or a Red Hat kernel, or a SuSE kernel, or an Oracle kernel, or...
        Each will have different backports, different hardware enablement, different random patches of choice, and so different bugs and problems.
        Unless we're evaluating the security of a particular distro release, mainline is what is Linux and will ultimately be the shared base for future releases.
- fwip 7 months ago ago
  24.04 is an LTS (long term support) release, so it receives updates, including security updates, for much longer than a regular release. I believe it's a 5-year support window, and longer if you shell out for paid support.
  [-]
  - arghwhat 7 months ago ago
    These updates mean that you are no longer running a mainline kernel, but an Ubuntu fork with whatever backports and hardware enablement (and new bugs!) this might introduce. This is also true for other software.
    LTS does not mean you get all updates, it only means you get to drag your feet for longer with random bugfixes. Only the latest release has updates.
    [-]
    - anyfoo 7 months ago ago
      This only matters if the mainline kernel since then somehow experienced changes which would affect this hardware vulnerability (fixed through microcode), which I see no indication of?
      [-]
      - arghwhat 7 months ago ago
        CPU vulnerabilities are first fixed through kernel mitigations, only sometimes through microcode.
        But security research should be done against the current state. Something as simple as a performance optimization can end up affecting the exploitability, and while that doesn't change whether the CPU is vulnerable it does change the conclusion.
        Evaluating whether a particular old, forked codebase is identical security-wise is a fool's errand, and even then that doesn't answer whether an equivalent Red Hat kernel is vulnerable, as that's a different fork with different backports and local patches. Mainline is the shared base.
        [-]
        anyfoo 7 months ago ago
        I don’t quite understand how that matters here. The researchers found a CPU vulnerability. They demonstrated it on a popular Linux distribution and LTS version, Ubuntu 24.04. They likely picked that to show that the attack is not purely theoretical, but feasible on something that real users currently use for real things. There is a microcode fix available that solves this problem, presumably across all OSes and releases. Whether the kernel is current and how much it diverges is, frankly, irrelevant.
        [-]
        arghwhat 7 months ago ago
        They are not just looking for vulnerabilities, they're demonstrating impact which is kernel dependent.
        The kernel has numerous CPU bug mitigations that change kernel behavior to make the CPU bug ineffective for active exploitation (microcode rarely fixes bugs other than just disabling a whole subsystem - they usually take silicon iterations to fix, and the kernel has to pick up the slack), and current kernel design choices may also unintentionally render the vulnerability ineffective.
        That's why they specifically say what OS and version they're running, exactly because it is crucial. It's just that they are not, in fact, up to date when it comes to the kernel.
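Since the discussion above turns on which mitigations a given kernel actually applies, here is a minimal sketch for inspecting that on a Linux machine. It assumes the kernel exposes its per-vulnerability status under `/sys/devices/system/cpu/vulnerabilities/`, as Linux kernels have done since the Spectre/Meltdown era; the helper name `read_mitigation_status` is made up for illustration, and which entries appear (and how they are worded) depends on the kernel version and distro patches in use.

```python
from pathlib import Path

# Directory where Linux reports per-vulnerability mitigation status.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigation_status(vuln_dir=VULN_DIR):
    """Return {vulnerability_name: kernel-reported status}, or {} if unavailable."""
    if not vuln_dir.is_dir():
        return {}  # not Linux, or a kernel without this sysfs interface
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "<unreadable>"
    return status


if __name__ == "__main__":
    for name, state in read_mitigation_status().items():
        print(f"{name}: {state}")
```

Comparing this output between a distro LTS kernel and a mainline build on the same hardware is one way to see concretely whether the two differ in the mitigations they report.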