Becoming a compiler engineer

(rona.substack.com)

284 points | by lalitkale 2 days ago ago

135 comments

ndesaulniers 2 days ago ago
If folks are interested in compilers and looking for where to get started, we're always looking for new contributors:
Building the Linux kernel with LLVM: https://github.com/ClangBuiltLinux/linux/issues
LLVM itself: https://github.com/llvm/llvm-project/issues?q=is%3Aissue%20s...
torginus 2 days ago ago
This is a personal puff piece. Her accomplishments are impressive and well deserved, but she needn't use the title of 'Becoming a Compiler Engineer' as an attack vector to get people interested in writing compilers to read her greatest hits of her early to mid 20s.
The way to become a compiler engineer by definition is to try and write a compiler , for which the best course of action is to focus on learning how tokenizing, ast building, typechecking, and various intermediate representations work.
I don't pretend to know what's the best tutorial for this, but I think this is a fairly good one:
https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/index...
This is for LLVM, but I think doing basic codegen from generic SSA is not a huge leap from this point if one wants to build an entire compiler from scratch.
You do not need to be 'goes to MIT' level of smart, but you do have to understand the basic concepts, which I think is an absolutely manageable amount - about a scope of a couple hundred page paperback or a single challenging CS course worth of info to get started.
[-]
- seanmcdirmid 2 days ago ago
  So few people trained or specialized in language implementation and compiler writing actually get the chance to write compilers. Those jobs are just so rare that many people in that area re-specialize into something else (like AI these days).
  [-]
  - zfnmxt 2 days ago ago
    They aren't that rare. And AI is expanding the niche because making parallel linear algebra go zoom zoom is compiler work. There's also a lot of quantum compiler work.
    [-]
    - seanmcdirmid 2 days ago ago
      Ya, I almost got a quantum compiler job at Alibaba (they decided to go in a different direction), and a job with Microsoft working complied ML support for Julia also fell through (I passed the interview, but they couldn’t get the head count) before ultimately joining Google working on developer experiences.
    - fweimer a day ago ago
      All those LLVM forks need maintainers, too.
      Then there are the people building compilers accidentally, like in the <xyz>-as-code space. Infrastructure automation deal with grammars, symbol tables, (hopefully) module systems, IRs, and so forth. Only the output is very different.
      And of course the toolchain space is larger than just compilers. Someone needs to maintain the assemblers, linkers, debuggers, core runtime libraries. If you are building a Linux distribution, someone has to figure out how the low-level pieces fit together. It's not strictly a compiler engineering role, but it's quite close. Pure compiler engineering roles (such as maintaining a specific register allocator) might be quite rare.
      It's a small field, but probably not that obscure. Despite the efficiency gains from open-source compilers, I don't think it's shrinking.
  - Joel_Mckay 2 days ago ago
    It is a Chicken-or-Egg problem, as most commercial attempts at compilers were awful partially compliant bodges, and or expensive.
    The GNU gcc groups win is rarely considered these days, but kids who were stuck with masm, risc asm, or Basic thought they were heroes to the hobby.
    Slowly, the FOSS compilers grew in features, and commercial entities realized it was less risky to embrace an established compiler ecosystem, and paid people to port in an experienced user-base.
    Starting from scratch is hard, as people often make the same logical mistakes. =3
    "There are two mistakes one can make along the road to truth...not going all the way, and not starting."(Prince Gautama Siddhartha)
- userbinator 2 days ago ago
  IMHO the best tutorial is Crenshaw's "Let's Build a Compiler" series.
  [-]
  - phba a day ago ago
    Thank you for the recommendation! I was looking for compiler texts that take a more practical approach.
- jibal 2 days ago ago
  > This is a personal puff piece.
  Seriously? She's posting on her personal blog (that's what her substack is). Up front she says that one of the two kinds of people she's writing for are those who are interested in her personally.
  > she needn't use
  There are a lot of things that people needn't do, like absurdly talk about "attack vectors".
  [-]
  - torginus a day ago ago
    I might have been a bit harsh in how I phrased it, however if it's just a personal blog post, it does not belong on HN. I don't think most people here are interested in strangers' personal lives.
    If it was her who posted it, she should've made it more clear that this is a 'personal journey' post. If it was someone else without her permission, then that should've not done so.
    If you look at most of the discussion that emerged around it, it's about the relative technical merits of various compiler frameworks and techniques, and getting a job working with compilers, which I think most people expected to read about.
    [-]
    - jibal a day ago ago
      There's so much wrong with this ... no it wasn't her who posted it, and no one needs her permission to do so. (And how do you know that they didn't get it?) And if it didn't belong on HN then it would have been flagged and killed. It wasn't because it isn't just about her personal life--it has useful and appropriate content for readers of HN. again, up front she says that there are two kinds of people she's writing for. (Even then, the "personal journey" aspects of the article are largely relevant for HN readers interested in being compiler engineers.)
      And do you really have no sense of how utterly absurd
      > she needn't use the title of 'Becoming a Compiler Engineer' as an attack vector
      is? Her title is perfectly reasonable and beyond any rational sensible intellectually honest good faith criticism.
      The fault you seek lies entirely within.
      [-]
      - torginus 21 hours ago ago
        First you chew me out for pointing out that this isn't a very good resource for compiler stuff as its more of an autobiographical post.
        Then you turn around and claim that this was a technical article after all. And no, the advice of 'you have to be among the three dozen people that get accepted to MIT to do math' is an unnecessarily elitist piece advice, that's not even necessary, as I (and a lot of others) pointed out there are a ton of good books and resources you can just get started with.
        I know HN disallows editorializing, but the title reads like it'll be an useful resource on compilers. The writer obviously chose that title in context of her personal blog, where it makes sense, but I maintain that a strangers life update is not interesting to the rest of us, and if I were the writer, I wouldn't be too happy if someone decided to direct a ton of eyeballs to my personal post.
        I'd implore you to look around in the thread, most people had different expectations and complained about the same thing I did. Still the post was useful, since an interesting conversation happened around the topic of compilers.
        And I'd like you to realize, that you are just one person sharing his personal opinion, not some arbiter of moral goodness, please stop attacking me, and I feel bad for having to defend myself.
      - jibal 21 hours ago ago
        P.S.
        > First you chew me out for pointing out that this isn't a very good resource for compiler stuff as its more of an autobiographical post.
        I never did any such thing. I didn't even bother to read past that false claim as I expect the rest to be similarly false and dishonest, and won't respond to anything else from that source.
- wat10000 2 days ago ago
  Puff piece? Attack vector? It’s just a personal story.
- rurban 2 days ago ago
  LLVM is certainly not a recommended way to start. That would be to start with a small lisp or ocaml compiler. There you have all the batteries included and the project would still be small enough.
  [-]
  - torginus 2 days ago ago
    Honestly I've read a ton of scaremongering about how awful and complex LLVM was, and I too assumed the worst, but the API is surprisingly benign and well documented for how difficult a task it accomplishes.
    I also checked out Cranelift which was recommended to me as a battle-tested but easier to use alternative, and honestly I haven't found it to be that much simpler.
  - fooker 2 days ago ago
    Sure if you want to build a lisp clone.
    Anything else, you're better off starting off with LLVM. Especially if you want a compiler job.
    [-]
    - rurban 2 days ago ago
      I've got compiler jobs without ever touching LLVM. Even gcc has a better codebase and performance. If just its maintainers would behave as nice as the LLVM maintainers who are really nice.
      Eg with gcc you can write a jit in pure C. With LLVM you'd still need C++ for name lookup hooks. With qbe you won't need any of this. With lisp much less.
      [-]
      - fooker a day ago ago
        Yes, I'm sure you got a compiler job without touching LLVM.
        Doesn't mean it's good advice for someone getting into a field where LLVM is the technology used for 90% of the jobs.
      - torginus a day ago ago
        You got me to check out GCC and QBE, and I say, not having to bother with SSA does seem to make things a bit easier, with GCC's Gimple being a kind of pseudo-assembly language, and Generic being almost C.
        Still I think once you get over the hurdle of SSA-ifying your procedural code, LLVM is all right, and seems to be a lot more feature rich (I might be stupid, but I don't see how to make stack maps in GCC).
        Also GCC is very lightly documented while LLVM seems to have quite good docs and tutorials.
        [-]
        fooker 18 hours ago ago
        You don't need SSA for LLVM.
        Just emit load and store instructions, and it'll be converted automatically.
        What SSA gives you is substantially easier analysis and optimizations on the intermediate representation, so much that all compilers use it (yes, even GCC).
- raldu a day ago ago
  Interesting how titles alone has been getting the front page recently.
- rvz 2 days ago ago
  > You do not need to be 'goes to MIT' level of smart, but you do have to understand the basic concepts, which I think is an absolutely manageable amount - about a scope of a couple hundred page paperback or a single challenging CS course worth of info to get started.
  You certainly don't need to be from 'top 3 college in US' at all. Neither did the creators of Zig (Andrew Kelly) and LLVM, Swift and Mojo (Chris Lattner) ever did.
  All you need is genuine interest in compilers, the basic CS fundamentals including data structures and algorithms. Start by building a toy compiler and then study the existing open-source ones and make some contributions to understand how it all works.
  [-]
  - wat10000 2 days ago ago
    I don’t know about Kelly but Lattner certainly did. UIUC is pretty top for CS. That’s where NCSA is, as in NCSA Mosaic and many others.
- almostgotcaught 2 days ago ago
  > Her accomplishments are impressive and well deserved
  what exactly are those even? that she went to MIT? from her linkedin she's at some blockchain startup (for only 4 months) doing "compiler work" - i put it in quotes because these jobs actually happen to be a dime-a-dozen and the only thing you have to do to get them is pass standard filters (LC, pedigree, etc.).
  [-]
  - jibal a day ago ago
    The article says
    > I was a compiler engineer at a startup in New York City. In that role, I worked on extending a preexisting open-source programming language.
    > I am now a compiler engineer at a large, post-IPO tech company in the San Francisco Bay Area. I work on making programming languages run faster.
    The 4 month startup is the former.
chubot 2 days ago ago
Very interesting and informative!
I'm a bit shocked that it would take significant effort/creativity for an MIT grad with relevant course/project work to get a job in the niche
I would have thought the recruiting pipeline is kinda smooth
Although maybe it's a smaller niche than I think -- I imagine compiler engineers skew more senior. Maybe it's not a common first or second job
I graduated at the bottom of bear market (2001), and it was hard to get a job. But this seems a bit different
[-]
- munificent 2 days ago ago
  > I'm a bit shocked that it would take significant effort/creativity for an MIT grad with relevant course/project work to get a job in the niche
  That bit was heartbreaking to me too. I knew the economy was bad for new grads but if a double major from MIT in SF is struggling, then the economy is cooked.
  [-]
  - verandaguy 2 days ago ago
    While the economy's definitely in a shitty spot (and IMO heading towards shittier), I wouldn't necessarily take this specific line as a sign of the times. The author does outline reasons why demand for compiler engineers (and junior ones in particular) is likely low in her post.
    Compiler development is (for better or worse) a niche that favours people who've got real-world experience doing this. The traditional ways to get in have either been through high-quality, high-profile open-source contribs, or because your existing non-compiler-dev job let you inch closer to compiler development up until the point you could make the jump.
    As the author noted, a lot of modern-day compiler work involves late-life maintenance of huge, nigh-enterprise-type code bases with thousands of files, millions of LOC, and no one person who has a full, detailed view of the entire project. This just isn't experience you get right out of school, or even a year or two on.
    Honestly, I'd say that as a 2023 grad with no mentors in the compiler dev space, she's incredibly lucky to have gotten this job at all (and to be clear, I hope she makes the most of it, compiler dev can be a lot of fun).
    [-]
    - ekidd 2 days ago ago
      Yup, I have been a junior compiler engineer at three(!) different jobs early in my career, before moving on to other stuff.
      It has never been a huge niche. It's fun work if you can get it. There were often MIT grads around, but I don't think it made you an automatic hire?
      Once in a blue moon, for old times' sake, I send a bug fix PR for someone's optimizer, or build a small transpiler for something.
  - laidoffamazon a day ago ago
    Also I’m 99% sure she had another job full time before this
  - laidoffamazon a day ago ago
    Elaborate on this. Is it not sad when a UIUC Or Perdue grad can’t get a compiler engineer job out of undergrad? What does it mean to be cooked?
    [-]
    - david-gpu a day ago ago
      Rephrased: If a graduate with relevant coursework from a top institution struggles to find a job in a particular field, what sort of chances do the rest of the graduates from less known colleges have?
      It makes sense now, doesn't it?
      [-]
      - laidoffamazon a day ago ago
        UIUC and Perdue are not lesser known than MIT. Big state schools are roughly as known they just don’t have the same signal for being a superhuman.
        [-]
        david-gpu 20 hours ago ago
        Well, at least I tried.
- achierius 2 days ago ago
  It's definitely a pretty small world, and to make things worse there are sub-niches -- between which there's certainly cross-pollination, but that's still a barrier to people looking to change jobs. Frontend language semantics (where most PL papers focus) vs. middle-and-back end optimization and hardware support; AoT compilers vs. JITs; CPU targets vs. a blossoming array of accelerators, etc.
  Beyond that, I've definitely interviewed people who seemed like they could have been smart + capable but who couldn't cut it when it came to systems programming questions. Even senior developers often struggle with things like memory layouts and hardware behavior.
- aidenn0 2 days ago ago
  I'm not familiar with the current job market (There is a lot of uncertainty throughout all of the US hiring departments in all fields right now), but it certainly wasn't that hard a couple of years ago.
  Compilers are just programs like anything else. All the compiler developers I know were trained up by working on compilers. Just like people writing B2B ecommerce software learned how to do so by working on B2B e-commerce software and embedded software developers learned how to do so by working on embedded software.
  Heck, a typical CS degree probably covers more of the basics for compilers than B2B ecommerce software or embedded software!
  [-]
  - vidarh a day ago ago
    But there are magnitudes more B2B ecommerce software than compilers for people to get experience on.
- wat10000 2 days ago ago
  I’d expect it to be a pretty small niche. How many companies need compiler engineers? Some big tech companies have compiler groups, but they’re a small part of the business. Most software companies are consumers of compilers, not producers.
tiu 2 days ago ago
The comments are wildly fragmented in this thread. I agree with @torginus, the article has less and less of anything useful to people that want to get into compilers.
Anyways, the "Who .. hires compiler engineer?" section is fairly vague in my opinion, so: AMD, Nvidia, Intel, Apple, Google definitely hire for compiler positions. These hire fairly 'in-open' so probably the best bets all around. Aside from this, Jane Street and Bloomberg also do hire at the peak tier but for that certain language. The off beat options are: Qualcomm, Modular, Amazon (AWS) and ARM. Also see, https://mgaudet.github.io/CompilerJobs/
I seriously attempted getting into compilers last year before realising it is not for me but during those times it felt like people who want to be compiler devs are much much more in number compared to jobs that exist (yes exist, not vacant).
The common way to get going is to do LLVM. Making a compiler is great and all but too many people exist with a lox interpreter-compiler or something taken from the two Go books. Contributing to LLVM (or friends like Carbon, Swift, Rust) or atleast some usage experience is the way. The other side of this is doing GNU GCC and friends but I have seen like only one opening that mentions this way as being relevant. University level courses are rarely of any use.
Lastly, LLVM meetups/conferences are fairly common at most tech hubs and usually have a jobs section listing all requirements.
A few resources since I already made this comment too long (sorry!):
[0]: https://bernsteinbear.com/pl-resources/ [1]: https://lowlevelbits.org/how-to-learn-compilers-llvm-edition... [2]: https://www.youtube.com/@compilers/videos
[-]
- wiseowise a day ago ago
  > Making a compiler is great and all but too many people exist with a lox interpreter-compiler or something taken from the two Go books
  Damn, you don’t hold back, do you?
  [-]
  - vidarh a day ago ago
    It's not that it's bad that people have written a compiler. It's that having written a simple one isn't a very useful indicator.
- doix 2 days ago ago
  Semiconductor companies developing DSPs also likely hire them. My first job was writing an LLVM backend for a DSP.
  Looking through the domains in the LLVM mailing list or the git commits should get you a longer list of "off beat" options.
- mattgreenrocks 2 days ago ago
  Good synopsis! I enjoyed my time doing some compiler-type work in the past, but there are so few job openings that it can feel quite cramped after awhile, especially without great experience/credentials.
  Definitely worth some self-study, however, if only for the healing effect of being exposed to a domain where the culture is largely one of quality instead of...everything except that. :)
- MatejKafka a day ago ago
  Microsoft also has many engineers working on compilers, with open positions - MSVC, C#, F#, CLR, rustc and other projects.
- criemen a day ago ago
  > but for that certain language
  What do you mean by that?
  [-]
  - roarcher a day ago ago
    I assume they mean those firms hire compiler engineers to work on the specific languages they use. Jane Street famously uses OCaml for pretty much everything. Not sure about Bloomberg, though a quick search shows that they have Bloomberg Query Language and Bloomberg Scripting Language, both proprietary.
    [-]
    - tiu 19 hours ago ago
      Thanks!
      Bloomberg also does use OCaml by the way, although probably not to the extent of Jane Street.
munificent 2 days ago ago
Tangential but since she mentions her book, "You Had Me At Hello World", is the cutest title for a nerd romance novel that I can imagine.
[-]
- ge96 2 days ago ago
  I'm thinking "et tu btrfs?"
- laidoffamazon a day ago ago
  It was supposed to be out years ago! She got a substantial advance but presumably delayed it due to the plagiarism scandal
  [-]
  - wavemode a day ago ago
    > plagiarism scandal
    Do tell
    [-]
    - laidoffamazon a day ago ago
      From one of the publishers she worked with - https://www.halfmystic.com/blog/you-are-believed
rs186 a day ago ago
I cannot recall the last time (if ever) I saw any article on HN that has a "this is a photo of me" in the middle of it, coming out of nowhere.
[-]
- zvmaz a day ago ago
  Maybe a less subtle self-promotional blog post, as compared to others.
hasbot a day ago ago
In the 80's I wanted to be a compiler engineer. Got a masters degree in it and published a paper on LR parsing with original research in the Journal of ACM. The opportunities back then were scarce. Over nearly 15 years I found a couple of gigs that consumed a few years. But it was hard and time consuming to develop the knowledge and skills. I used to study the PCC and GCC source code! I worked on GUIs between these gigs and when Java/Swing dropped, I switched full-time to GUIs. There were far more opportunities and I enjoyed developing GUIs for a time so it was a good switch.
pkd 2 days ago ago
I'm almost more interested in how a 20-something with no apparent prior pedigree lands a Simon and Schuster debut novel contract!
[-]
- thereitgoes456 2 days ago ago
  She lost that contract after being found guilty of plagiarism. That’s why she avoids mentioning her considerable writing career at all
  [-]
  - jasonjmcghee 2 days ago ago
    It's fiction what is she plagiarizing
    [-]
    - thereitgoes456 2 days ago ago
      What I just said is a fact. Look it up if you like
      [-]
      - defrost 2 days ago ago
        The similarities are intriging but not compelling.
        https://docs.google.com/document/d/1pPE6tqReSAXEmzuJM52h219f...
        Stories of "asian face" actresses with eyes taped back, prominent pieces of anti asian grafitti on walls and drawn in bathrooms are common tropes in asian communities, etc.
        The examples of plagiarism are examples of common story arcs, with an educated asian female twist, and use of examples that multiple writers in a shared literary pool would have all been exposed to; eg: it could be argued that they all drew from a similar well rather thn some were original and others copied.
        There's a shocked article: https://www.halfmystic.com/blog/you-are-believed that may indeed be looking at more evidence than was cited in the google docs link above which would explain the shock and the dismissal of R.W. as a plagiarist.
        The evidence in the link amounts to what is common with many pools of proto writers though, lots of similar passages, some of which have been copied and morphed from others. It's literally how writers evolve and become better.
        I'm on the fence here, to be honest, I looked at what is cited as evidence and I see similar stories from people with similar backgrounds sharing common social media feeds.
        [-]
        sarchertech 2 days ago ago
        One of her publishers pulled her book from print, publicly accused her of plagiarism, and asked other publishers to denounce her for plagiarism.
        That’s pretty damning evidence. If a publisher was on the fence they might pull her books quietly, but they wouldn’t make such a public attack without very good evidence that they thought would hold up in court. There was no equivocation at all.
        [-]
        thereitgoes456 2 days ago ago
        Said publisher also claims Rona directly admitted plagiarism to them. That’s probably why they’re so confident.
        defrost 2 days ago ago
        That's a pretty damning response, sure.
        The evidence, at least the evidence that I found cited as evidence, appears less damning.
        Perhaps there is more damning evidence.
        What I found was on the order of the degree of cross copying and similar themes, etc. found in many pools of young writers going back through literary history.
        Rona Wang, whom I've never previously heard of, clearly used similar passages from her peers in a literary group and was called out for it after receiving awards.
        I would raise two questions, A) was this a truly significant degree of actual plagarism, and 2) did any of her peers in this group use passages from any of Tona's work ?
        On the third hand, Kate Bush was a remarkable singer / song writer / performer. Almost utterly unique and completely unlike any contempory.
        That's ... highly unusual.
        The majority of writers, performers, singers, et al. emerge from pools that differ from their prior generations, but pools none the less that are filled with similarity.
        The arc of careers of those that rise from such origins is really the defining part of many creators.
        [-]
        sarchertech 2 days ago ago
        It is evidence because a strong condemnation raises the likelihood that the accusation is true.
        It doesn’t prove anything, but it supports the theory that they have seen additional evidence.
        After researching this a bit, it looks like someone from publisher says she admitted it to them. That certainly explains why they weren’t afraid to publicly condemn her.
        thereitgoes456 2 days ago ago
        So typical of what I dislike about Hacker News. Are you a fiction writer? Who are you to think you have any useful insight into whether it’s plagiarism?
        It’s compelling evidence to me, and seemingly no actual fiction writer says otherwise, and even Rona has not tried to defend herself from accusations, merely (people say) hiring a PR firm to wipe the slate clean.
        With the roles reversed: if a writer passed judgment on whether a piece of code had been plagiarized, nobody would, or should, listen to them. Why would this be different?
        drysine 9 hours ago ago
        >On the third hand
        On the gripping hand
        neilv a day ago ago
        Thanks, I looked at some of those examples. Several I saw were suspiciously similar, and I wonder how they got that way. Others didn't look suspicious to me.
        I wonder whether the similar ones were the result of something innocent, like a shared writing prompt within the workshop both writers were in, or maybe from a group exercise of working on each others' drafts.
        Or I suppose some could be the result of a questionable practice, of copying passages of someone else's work for "inspiration", and rewriting them. And maybe sometimes not rewriting a passage enough.
        (Aside relevance to HN professions: In software development, we are starting to see many people do worse than copy&revise a passage plagiarism. Not even rewriting the text copy&pasted from an LLM, but simply putting our names on it internally, and company copyrights on it publicly. And the LLM is arguably just laundering open source code, albeit often with more obfuscation than a human copier would do.)
        But for a lot of the examples of evidence of plagiarism in that document, I didn't immediately see why that passage was suspect. Fiction writing I've seen is heavily full of tropes and even idiomatic turns of phrase.
        Also, many stories are formulaic, and readers know that and even seek it out. So the high-powered business woman goes back to her small town origins for the holidays, has second-chance romance with man in a henley shirt, and she decides to stay and open a bakery. Sprinkle with an assortment of standard subgenre trope details, and serve. You might do very original writing within that framework, but to someone who'd only ever seen two examples of that story, and didn't know the subgenre convention, it might look like one writer totally ripped off the other.
      - jasonjmcghee 2 days ago ago
        No I'm literally saying - she writes fiction- how can you plagiarize a fiction book and make it work lol
        (I have no knowledge / context of this situation - no idea if she did or what happened here)
        [-]
        jibal 2 days ago ago
        You don't seem to know what plagiarism is.
        typpilol 2 days ago ago
        You can't plagiarize fiction?
        So if I copy paste Harry Potter that's ok?
        What kind of argument is that
        [-]
        jasonjmcghee 2 days ago ago
        Absolutely not saying this or making this argument.
        I just don't see how this could possibly work - how would slapping Harry Potter in the middle of the book your writing work
        [-]
        sarchertech 2 days ago ago
        Instead of slapping Harry Potter in the middle of your book wholesale, imagine you lifted a few really good lines from Harry Potter, a few from Lord of the Rings, and more from a handful of other books.
        Read the evidence document another poster linked for actual examples.
        [-]
        1718627440 a day ago ago
        To me as a dumb reader, that would be fine, maybe the author could have mentioned that he likes these authors and takes them as inspirations. Also you can't really forbid books to never have references to pop culture. And at some level of famous-ness passages and ideas loose their exclusive tie to the original book and become part of the list of common cultural sayings.
        [-]
        sarchertech a day ago ago
        >could have mentioned
        Well plagiarism by definition means passing the work off as your own without crediting the author, so in that case it isn’t plagiarism.
        References to pop culture are the same as lifting sentences from other books and pretending you wrote them.
        > And at some level of famous-ness passages and ideas loose their exclusive tie to the original book and become part of the list of common cultural sayings
        In the actual case being examined the copied references certainly hadn’t reached any such level of famousness.
        Also there’s a difference between having a character tell another “not all those who wander are lost” as a clear reference to a famous quote from LOTR and copying multiple paragraph length deep cuts to pass off as your own work.
        [-]
        1718627440 a day ago ago
        > Well plagiarism by definition means passing the work off as your own without crediting the author, so in that case it isn’t plagiarism.
        Of course, but wrote 'could' and not 'should' for a reason, I won't expect it. A book isn't a paper and the general expectation is that the book will be interesting or fun to read and not that it is original. That means the general expectation is not that it is never a rehash of existing ideas. I think ever book including all the good ones is. A book that invents the world from scratch might be novel, but unlikely what people want to read.
        > copying multiple paragraph length deep cuts to pass off as your own work.
        If that is true, it sounds certainly fishy, but that is a case of violation of copyright and intellectual property and not of plagiarism.
        [-]
        sarchertech a day ago ago
        > That means the general expectation is not that it is never a rehash of existing ideas.
        There’s a different from rehashing existing ideas and copying multiple passages off as your own.
        > If that is true, it sounds certainly fishy, but that is a case of violation of copyright and intellectual property and not of plagiarism.
        What exactly do you think plagiarism is? Here’s one common definition:
        “An instance of plagiarizing, especially a passage that is taken from the work of one person and reproduced in the work of another without attribution.”
        [-]
        1718627440 a day ago ago
        > What exactly do you think plagiarism is? Here’s one common definition:
        Both are about passing of something of your own. Plagiarism is about passing ideas of insights of as your own. It doesn't really matter, whether you copy it verbatim, present it in your own words or just use the concept. It does however matter how important that idea/concept/topic is in your work and the work you took it from without attribution, and whether that is novel or some generally available/common knowledge.
        For violation of intellectual property it is basically the opposite. It doesn't matter, whether the idea or concept is fundamental for your work or the other work you took it from, but it does matter, whether it is a verbatim quote or only the same basic idea.
        Intellectual property rights is something that is enforced by the legal system, while plagiarism is an issue of honor, that affects reputation and universities revoke titles for.
        > There’s a different from rehashing existing ideas and copying multiple passages off as your own.
        Yes and that's the difference between plagiarism and violating intellectual property/copyright.
        But all this is arguing about semantics. I don't have the time to research whether the claims are true or not, and I honestly don't care. I have taken from the comments that it was only the case, that she rehashed ideas from other books, and I wanted to point out, that while this is a big deal for academic papers, it is not for books and basically expected. (Publishers might have different ideas, but that is not an issue of plagiarism.) If it is indeed the case that she copied other authors verbatim, then that is something illegal she can be sued for, but whether this is the case is for the legal system to be determined, not something I should do.
        [-]
        sarchertech a day ago ago
        >I have taken from the comments that it was only the case, that she rehashed ideas from other books, and I wanted to point out, that while this is a big deal for academic papers, it is not for books and basically expected.
        In addition to near verbatim quotes, she is also accused of copying stories beat for beat. That's much different than rehashing a few ideas from other works. It is not expected and it is very much considered plagiarism by fiction writers.
        As for the quotes she copied. That is likely both a copyright violation and plagiarism.
        Plagiarism isn't just about ideas but about expressions of those ideas in the form of words.
        Webster's definition:
        "to steal and pass off (the ideas or words of another) as one's own : use (another's production) without crediting the source"
        "to commit literary theft : present as new and original an idea or product derived from an existing source"
        Oxford learner's dictionary:
        "to copy another person’s ideas, words or work and pretend that they are your own"
        Copying verbatim or nearly verbatim lines from a work of fiction and passing them off as your own is both plagiarism and copyright violation.
        [-]
        1718627440 20 hours ago ago
        So I won't defend what was done here, there doesn't seem much to argue.
        > copying stories beat for beat. That's much different than rehashing a few ideas from other works. It is not expected and it is very much considered plagiarism by fiction writers.
        Some operas are a greek play. There rehashes of the Faust, the Beggars Opera is a copy of a play from Shakespeare, there are modern versions of Pride and Prejustice, there are tons of stories that are a copy of the Westside Story, which is itself a copy of Romeo and Julia, which I thinks comes from an even older story. This often don't come with any attribution at all, although the listener is sometimes expected to know that the original exists. They change the settings, but the plot is basically the same. Do you consider all of that to be plagiarism? These would be all a reason to call it plagiarism when considering a paper, but for books nobody bats an eye. This is because authors don't sell abstract ideas or a plot, they sell concrete stories.
        [-]
        sarchertech 5 hours ago ago
        First, the stories you mentioned are very famous. The audience watching Oh Brother Where Art Thou is aware it’s an adaptation of the Odyssey. Therefore it’s not someone attempting to pass off work as their own.
        The stories this authors copied were either unpublished manuscripts she got access to in writers groups or very obscure works that it’s unlikely her readers had read.
        Second, the examples you gave were extremely transformative. Just look at the differences between Westside Story and Romeo and Juliette. It’s a musical for goodness sake. It subverts expectations by letting Maria live through it.
        The writings at issue are short stories, so there’s less room for transformation in the first place. And there was clearly not even a strong attempt at transformation. The author even kept some of the same character names.
        There was no attempt to subvert expectations largely because the audience had expectations, since they weren’t aware of the originals.
        >change settings
        She didn’t even do that.
        > for books nobody bats an eye
        If a popular book were revealed to be a beat for beat remake of an obscure novel with the same setting, similar dialogue, some of the same character names, and few significant transformative elements, you can bet your life there would be a scandal.
mattrighetti 2 days ago ago
> I was seriously considering starting a compilers YouTube channel even though I’m awkward in front of the camera.
Doesn’t need to be a YT channel, a blog where you talk about this very complex and niche stuff would be awesome for many.
[-]
- 1970-01-01 a day ago ago
  Starting a channel just to stand out and land a first job really puts a spotlight on the sad situation of hiring in this job market. Imagine if you needed to record videos of yourself building and driving a car to land a job as a mechanic.
thxforthepost 2 days ago ago
Made an account to say thank you for sharing this post (and to Rona Wang for writing it)! I stumbled into having an interview for a Compiler Engineer position coming up and I wasn't sure how to prepare for it (the fact that I got this interview just goes to show how little people really know about Compilers if they're willing to take a chance on a normal C++ dev like me hah) and I had absolutely NO idea where to even begin (I was just working through Crafting Interpreters[1] that I picked up at the end of my contractorship last week but that's to make an Interpreter, not to make a Compiler)
...And honestly it seems that I'm screwed. And I need about 6 months of study to learn all this stuff. What I'd do right now is finish Crafting Interpreters, then grab that other book on Interpreters that was recommended here recently[2] and written in Go because I remember it had a followup book on Compilers, and THEN start going through the technical stuff that Rona suggested in the article.
And my interview is on Monday so that's not happening. I have other more general interviews that should pay better so I'm not too upset. If only I wasn't too lazy during my last position and kept learning while working. If the stars align and somehow I get that Compiler Engineer position, then I will certainly reach out to Rona and thank you again lalitkale for sharing this post with HN!
[1] https://craftinginterpreters.com/
[2] https://interpreterbook.com/
[-]
- moregrist 2 days ago ago
  In my dabbling with compilers I’ve found Andrew Appel’s books [0] to be invaluable for understanding backend (after parsing) compiler algorithms. It’s a bit dated but covers SSA and other still-relevant optimizations and is pretty readable.
  There are three versions (C, ML, and Java). The language isn’t all that important; the algorithms are described in pseudo-code.
  I also find the traditional Dragon Book to be somewhat helpful, but you can mostly skip the parsing/frontend sections.
  [0] https://www.cs.princeton.edu/~appel/modern/java/
- imvetri 2 days ago ago
  are you sure the content of the article is from some other article, quite a old one. as far as i remember
zerr 2 days ago ago
Most (all?) of compiler engineering jobs I've seen were about writing glue code for LLVM.
[-]
- achierius 2 days ago ago
  All the ones I've had, and most of the ones I've seen, we for bespoke compilers and toolchains for new HW / specific languages
- almostgotcaught 2 days ago ago
  Lol I love these clueless takes. I'm just curious who you thinks actually writes the stuff within LLVM? Chris lattner? Lololol
rrgok 2 days ago ago
RIP my dreams of becoming a professional parentheses balancer
getnormality a day ago ago
According to LinkedIn, the author landed at a small crypto startup. Any guesses as to why this startup needs a compiler engineer?
silcoon 2 days ago ago
Interesting article to get a bit more knowledge about the field. I went quickly trough some of the books cited and I have the same feeling that they’re not very practical. Also I didn’t find many practical books about LLVM either.
I would like to read in the future about what is the usual day of a compiler engineer, what you usually do, what are the most enjoyable and annoying tasks.
[-]
- eredengrin 2 days ago ago
  I've heard good things about "LLVM Techniques, Tips, and Best Practices" [0] but haven't gotten around to reading it myself yet. Packt does not always have the best reputation but it was recommended to me by someone I know and the reviews are also solid, so mentioning in case it's at all helpful.
  0: https://www.packtpub.com/en-us/product/llvm-techniques-tips-...
  [-]
  - silcoon a day ago ago
    Thanks! I hoped that someone would come with some suggestions
- achierius a day ago ago
  I can at least give my favorite and least ...
  - most enjoyable: fiddling with new optimizations
  - least enjoyable: root-causing bugs from customer crash stacks
alyxya 2 days ago ago
It's a bit sad seeing how much focus there is on using courses and books to learn about compilers.
> I’m not involved in any open-source projects, but they seem like a fantastic way of learning more about this field and also meeting people with shared interests. I did look into Carbon and Mojo but didn’t end up making contributions.
This sounds like the best way to learn and get involved with compilers, but something that's always been a barrier for me is just getting started in open source. Practical experience is far more valuable than taking classes, especially when you really need to know what you're doing for a real project versus following along directions in a class. Open source projects aren't usually designed to make it easy for anyone to contribute with the learning curve.
> So how the hell does anybody get a job?
> This is general advice for non-compilers people, too: Be resourceful and stand out. Get involved in open-source communities, leverage social media, make use of your university resources if you are still in school (even if that means starting a club that nobody attends, at least that demonstrates you’re trying). Meet people. There are reading groups (my friend Eric runs a systems group in NYC; I used to go all the time when it was held in Cambridge). I was seriously considering starting a compilers YouTube channel even though I’m awkward in front of the camera.
There's a lot of advice and a lot of different ways to try to find a job, but if I were to take away anything from this, it's that the best way is to do a lot of different meaningful things. Applying to a lot of jobs or doing a lot of interview prep isn't very meaningful, whereas the most meaningful things have value in itself and often aren't oriented towards finding a job. You may find a job sooner if you prioritize looking for a job, similar to how you may get better grades by cramming for a test in school, but you'll probably get better outcomes by optimizing for the long term in life and taking a short term loss.
fweimer a day ago ago
What does the other side look like? How would you go about finding people interested in this space, and who are not yet part of the LLVM and GNU toolchain communities (at least not in a very visible way)?
supriyo-biswas a day ago ago
I feel like the comments are too negative here. Yes, it may not be a step by step guide, but for niche roles like this, I feel no such guide could exist and only self study and some company taking the opportunity on you are the only ways in which one could get a systems/low-level job.
amelius a day ago ago
It's even harder to become a Compiler Rockstar. I know only three, Stallman, Peyton Jones and perhaps that guy from Svelte.
[-]
- silcoon a day ago ago
  Guy Steele, Rich Hickey, James Gosling, Kernighan and Ritchie, Guido Van Rossum, Bjarne Stroustrup.
  Harder because the bar is really high.
  [-]
  - tmtvl a day ago ago
    Nickles Worth (or Niklaus Wirth if you prefer to call him by name rather than value).
  - amelius a day ago ago
    Wasn't Stroustrup more a language designer than a compiler guy?
- hasbot a day ago ago
  For compiler fans, that's a very incomplete list. I'm rusty but off the top of my head, I see you're missing Walter Bright and Terrence Parr.
- epolanski a day ago ago
  Hejlsberg too (Turbo Pascal, C#, TypeScript).
  There's probably hundreds of other brilliant engineers more with insane impacts that never got any popularity.
goatsi 2 days ago ago
Step one: no engineering education, just get a job that a company calls engineering.
>In 2023, I graduated from MIT with a double major in math and computer science.
[-]
- wffurr 2 days ago ago
  Great series about whether programming is “engineering” or not: https://www.hillelwayne.com/post/are-we-really-engineers/
anon291 2 days ago ago
I've been in compiler engineering now for almost a decade. No grad school, just moved into the field and did a lot of random projects for my own entertainment (various compilers for toy languages, etc). It takes a particular type of person who cares about things like correctness. It is a very bizarre subsection of people with an improbably high number of transgender people and an equally high number of rad trad Catholics.
Which is to say that all it takes is an interest in compilers. That alone will set you apart. There's basically no one in the hiring pipeline despite the tech layoffs. I'm constantly getting recruiting ads. Current areas of interest are AI (duh) but also early-stage quantum compute companies and fully-homomorphic encryption startups. In general, you will make it farther in computer science the more niche and hard you go.
[-]
- vatsachak 2 days ago ago
  Rad trad Catholics makes sense because Catholic theology involves a lot of logic
  [-]
  - 1718627440 a day ago ago
    True, the scholastics were an effort to derive theology from philosophy and logic and are how Asian and Greek philosophy and culture became part of the Western world, which then evolved into Western logic, Math and Philosophy.
    The cultural importance of education in Jewry, the preservation of that in Christianity and that Christians can never take any understanding of something for granted and always need to question everything, because the Universe (being created by God) will always be more complex than any current knowledge of it, is the origin of the Western concept of Empirism and formalized (Natural-)Science, even if a lot of modern Atheists like to sweep that under the rack.
    A lot of early and also late scientists researched as a part to understand the world their God created, meaning they understood it as a approach to worship God.
fithisux 2 days ago ago
In truth we need a curriculum to help people learn how to become compiler engineers.
Hands-on balanced with theory.
We need more compilers (and interoperability of course) and less dependence on LLVM.
calvinmorrison 2 days ago ago
being a compiler engineer is like making it in hollywood with a lot less glam. There are maybe 10-15 serious compiler projects out there, think LLVM, GCC, Microsoft, Java, then you've got virtual language bytecode intepreters.
The world needs maybe what, 5000, 10000 of these people maximum? In a world with 8 billion people?
[-]
- achierius 2 days ago ago
  There's more than that. Not a huge amount more but still.
  - multiple large companies contribute to each of the larger AoT compiler projects; think AMD's contributions to LLVM and GCC, and multiple have their own internal team for handling compiler work based on some OSS upstream (eg Apple clang)
  - various companies have their own DSLs, eg meta's Hack, the python spinoff Goldman has going on, etc.
  - DBs have query language engineers which are effectively compiler roles
  - (most significantly) most hardware companies and accelerators need people to write toolchains; Triton, Pytorch, JAX, NVCC, etc. all have a lot of people working on them
matt3210 a day ago ago
Is this an ad or something? I was hoping for technical details :(
rvz 2 days ago ago
Great article. Here is a very simple test that I use to find very cracked compiler engineers on this site.
Just search for either of the words "Triton", "CUDA", "JAX", "SGLang" and "LLVM" (Not LLM) and it filters almost everyone out on "Who wants to be Hired' with 1 or 2 results.
Where as if you search "Javascript", 200+ results.
This tells me that there is little to no interest in compiler engineering here (and especially in startups) unless you are at a big tech company or at one of the biggest AI companies that use these technologies.
Of course, the barrier is meant to be high. but if a recruiter has to sift through 200+ CVs a page of a certain technology (Javascript), then your chances of getting selected against the competition for a single job is vanishingly small.
I said this before and it works all the time, for compilers; open-source contributions to production-grade compiler projects with links to commits is the most staightforward differentiator and proof one can use to stand out against the rest.
[-]
- zdragnar 2 days ago ago
  I can't think of any of my employers I've had in the last 15 years that would have cared that I committed code to a compiler project, with one exception. That one exception would have told me they'd rather have me work on a different product than the one I was applying to, despite the one I was applying to being more interesting to me than debugging compilers all day.
  YMMV, I guess, but you're better off demonstrating experience with what they're hiring for, not random tech that they aren't and never will use.
imvetri 2 days ago ago
This is an old content and reposted in this article.
laidoffamazon a day ago ago
For those unaware or may find her name familar, Rona is known for her plagiarism scandal. She blocks anyone on Twitter who asks about it or the book she got a substantial advance for but didn’t publish after this scandal came to light. She seems to have walked away from it - this utter elite impunity makes me sick.
https://www.halfmystic.com/blog/you-are-believed
_mocha a day ago ago
This is honestly one of the worst blog posts I've ever read, and probably does a disservice to representing MIT grads (who shaped my entire career 20-30 years ago). Anyways, as someone who was in this space, my 2 pieces of advice are: 1) either get a PhD in the field (and Apple would pick you up relatively easily) 2) have a small history of contributing to languages like rust, go or be prominent on the clang committees, llvm, ghc.
At least up until 5 years ago, the bar to join compiler teams was relatively low and all it required was some demonstration of effort and a few commits.
(Disclosure: am retired now)
phendrenad2 2 days ago ago
Not many companies are willing to maintain a compiler... but LLMs will change that. An LLM can find bugs in the code if the "compiler guru" is out on vacation that day. And yes, you will still need a "compiler guru" who will use the LLM but do so at a much higher level.
[-]
- DonaldPShimoda 2 days ago ago
  I'm desperately looking forward to, like, 5-10 years from now when all the "LLMs are going to change everything!!1!" comments have all but completely abated (not unlike the blockchain stuff of ~10 years ago).
  No, LLMs are not going to replace compiler engineers. Compilers are probably one of the least likely areas to profit from extensive LLM usage in the way that you are thinking, because they are principally concerned with correctness, and LLMs cannot reason about whether something is correct — they only can predict whether their training data would be likely to claim that it is correct.
  Additionally, each compiler differs significantly in the minute details. I simply wouldn't trust the output of an LLM to be correct, and the time wasted on determining whether it's correct is just not worth it.
  Stop eating pre-chewed food. Think for yourself, and write your own code.
  [-]
  - dullcrisp 2 days ago ago
    I bet you could use LLMs to turn stupid comments about LLMs into insightful comments that people want to read. I wonder if there’s a startup working on that?
  - phendrenad2 2 days ago ago
    I'm screenshotting this, let's see who's right.
    Actually, your whole point about LLMs not being able to detect correctness is just demonstrably false if you play around with LLM agents a bit.
    [-]
    - 1718627440 a day ago ago
      A system outputting correct facts, tells you nothing about the system's ability to prove correctness of facts. You can not assert that property of a system by treating it as a black box. If you are able to treat LLMs as a white box and prove correctness about their internal states, you should tell that to some very important people, that is an insight worth a lot of money.
- bigstrat2003 2 days ago ago
  For that, they would need to make LLMs not suck at easy programming tasks. Considering that with all the research and money poured into it they still suck at easy stuff, I'm not optimistic.
- est31 2 days ago ago
  LLMs (or LLM assisted coding), if successful, will more likely make the number of compilers go down, as LLMs are better with mainstream languages compared to niche ones. Same effect as with frameworks. Less languages, less compilers needed.
  [-]
  - cube2222 2 days ago ago
    I mostly disagree.
    First, LLMs should be happy to use made up languages described in a couple thousand tokens without issues (you just have to have a good llm-friendly description, some examples). That and having a compiler it can iterate with / get feedback from.
    Second, LLMs heavily reduce ecosystem advantage. Before LLMs, presence of libraries for common use cases to save myself time was one of the main deciding factors for language choice.
    Now? The LLM will be happy to implement any utility / api client library I want given the API I want. May even be more thoroughly tested than the average open-source library.
    [-]
    - achierius 2 days ago ago
      Have you tried having an LLM write significant amounts of, say, F#? Real language, lots of documentation, definitely in the pre-training corpus, but I've never had much luck with even mid sized problems in languages like it -- ones where today's models absolutely wipe the floor in JavaScript or Python.
      [-]
      - torginus 2 days ago ago
        Even best in class LLMs like GPT5 or Sonnet 4.5 do noticeably worse in languages like C# which are pretty mainstream, but not on the level of Typescript and Python - to the degree that I don't think they are reliably able to output production level code without a crazy level of oversight.
        And this is for generic backend stuff, like a CRUD server with a Rest API, the same thing with an Express/Node backend works no trouble.
      - cube2222 2 days ago ago
        I’m doing Zig and it’s fine, though not significant amounts yet. I just had to have it synthesize the latest release changelog (0.15) into a short summary.
        To be clear, I mean specifically using Claude Code, with preloaded sample context and giving it the ability to call the compiler and iterate on it.
        I’m sure one-shot results (like asking Claude via the web UI and verifying after one iteration) will go much worse. But if it has the compiler available and writes tests, shouldn’t be an issue. It’s possible it causes 2-3 more back and forths with the compiler, but that’s an extra couple minutes, tops.
        In general, even if working with Go (what I usually do), I will start each Claude Code session with tens of thousands of tokens of context from the code base, so it follows the (somewhat peculiar) existing code style / patterns, and understands what’s where.
      - phendrenad2 2 days ago ago
        Humans can barely untangle F# code..
  - phendrenad2 2 days ago ago
    See, I'm coming from the understanding that language development is a dead-end in the real world. Can you name a single language made after Zig or Rust? And even those languages haven't taken over much of the professional world. So when I say companies will maintain compilers, I mean DSLs (like starlark or RSpec), application-specific languages (like CUDA), variations on existing languages (maybe C++ with some in-house rules baked in), and customer-facing config languages for advanced systems and SaaS applications.
    [-]
    - jibal 2 days ago ago
      Yes, several, e.g., Gleam, Mojo, Hare, Carbon, C3, Koka, Jai, Kotlin, Reason ... and r/ProgrammingLanguages is chock full of people working on new languages that might or might not ever become more widely known ... it takes years and a lot of resources and commitment. Zig and Rust are well known because they've been through the gauntlet and are well marketed ... there are other languages in productive use that haven't fared well that way, e.g., D and Nim (the best of the bunch and highly underappreciated), Odin, V, ...
      > even those languages haven't taken over much of the professional world.
      Non sequitur goalpost moving ... this has nothing to do with whether language development is a dead-end "in the real world", which is a circular argument when we're talking about language development. The claim is simply false.
    - WhyOhWhyQ 2 days ago ago
      This seems like a case of moving the goalposts because Zig and Rust still seem newfangled to me. I thought nothing would come after C++11.
    - anon291 2 days ago ago
      Bad take. People said the same about c/c++ and now rust and zig are considered potential rivals. The ramp up is slow and there's never going to be a moment of viral adoption the way we're used to with SaaS, but change takes place.