A visual introduction to big O notation

(samwho.dev)

648 points | by samwho 3 days ago ago

185 comments

  • jcalx 2 days ago ago

    This article and its associated HN comment section continue the long tradition of Big O notation explainers [0] and of comment kerfuffles over the finer, technical points of such notation versus its practical usage [1]. The wheel turns...

    [0] https://nedbatchelder.com/text/bigo.html

    [1] https://nedbatchelder.com/blog/201711/toxic_experts.html

    • dawnofdusk a day ago ago

      From what I read in the comments of the first post, the Pyon guy seems very toxic and pedantic, but the rebuttal by Ned isn't great. For example, nowhere in the rebuttal is the pedantic technical detail ever actually described. In fact the prose reads very awkwardly in order to circumlocute around describing it, just repeatedly naming it "particular detail". In my view, the author overreaches: he dismisses Pyon not only for the delivery of his criticism (which was toxic) but also the content of his criticism (why?).

      Ultimately Ned is in the right about empathy and communication online. But since he's an educator, it would have been nice to hear, even briefly, why he thought Pyon's point was unnecessarily technical and pedantic. Instead he just says "I've worked for decades and didn't know it". No one is too experienced to learn.

      EDIT: I just skimmed the original comment section between Pyon and Ned and it seems that Ned is rather diplomatic and intellectually engages with Pyon's critique. Why is this level of analysis completely missing from the follow-up blogpost? I admit to not personally grasping the technical details or their importance; it would be nice to hear a summarized analysis...

      • xenotux a day ago ago

        Such is the curse of blogging: when writing a series of posts, authors naturally assume that the readers are as engaged and as familiar with the previous discussion as they are.

        In reality, even regular subscribers probably aren't, and if you're a random visitor years later, the whole thing may be impossible to parse.

      • arp242 a day ago ago

        > Why is this level of analysis completely missing from the follow-up blogpost

        Because that's not what the article is about. It's not about whether Pyon was right or wrong, it's that they were a dick about it. Their opening words were "you should be ashamed". They doubled and tripled down on their dickery later on.

        And no matter how good your point is or how right you are: if you're a dick about it people will dislike you.

    • frabonacci a day ago ago

      Good intro! I first learned Big O from Cracking the Coding Interview since many universities in Europe notoriously skip complexity notations even in basic programming classes. This definitely explains it in a much simpler way.

    • monkeyelite a day ago ago

      This is why being able to read and write math is so important. All this confusion can be resolved by a one-sentence definition.

    • jibal 2 days ago ago

      That second link is not about Big-O and is not something anyone should model.

    • samwho 2 days ago ago

      Hehe, yes, Ned sent me a lovely email the other day about it. Happy to be doing my part.

    • 0xbadcafebee a day ago ago

      Toxic expert here! I hate when blog posts try to teach complex subjects. It's almost always a non-expert doing the teaching, and they fail to do it accurately. This then causes 1) the entire internet to repeat the inaccuracies, and 2) readers to make no attempt to learn beyond the blog post, reinforcing their ignorance.

      I'll double down on my toxicity by saying I didn't like the page layout. As someone with ADHD (and a declining memory), I need to be led through formatting/sub-headings/bullets/colored sections/etc into each detail or it all blends together into a wall of text. The longer it takes to make a point (visually and conceptually), the more lost I am. I couldn't easily follow it. The Simple Wikipedia page was more straight to the point (https://simple.wikipedia.org/wiki/Big_O_notation), but reading the "full" Wikipedia page thrusts you headlong into a lot of math, which to me signifies that this shit is more complex than it seems and simplifying it is probably a bad idea.

      • xenotux a day ago ago

        > Toxic expert here! I hate when blog posts try to teach complex subjects. It's almost always a non-expert doing the teaching, and they fail to do it accurately. This then causes 1) the entire internet repeating the inaccuracies, and 2) the readers make no attempt to do further learning than the blog post, reinforcing their ignorance.

        Ask yourself why. The usual answer is that top experts either can't be bothered to create better content, or they actively gatekeep, believing that their field must remain hard to learn and the riff-raff must be kept out.

        I think the first step is to accept that laypeople can have legitimate interest in certain topics and deserve accessible content. The remedy to oversimplified explanations is to write something better - or begrudgingly accept the status quo and not put people down for attempts that don't meet your bar.

        It's also good to ponder whether the details we get worked up about actually matter. Outside academia, approximately no one needs a precise, CS-theoretical definition of big-O notation. Practitioners use it in a looser sense.

        • xigoi a day ago ago

          > Ask yourself why. The usual answer is that top experts either can't be bothered to create better content, or they actively gatekeep, believing that their field must remain hard to learn and the riff-raff must be kept out.

          Before asking why, ask if. There are good articles about complex topics; they just get drowned out by bad ones.

        • sgarland 20 hours ago ago

          As a semi-toxic semi-expert, I think one of the main reasons there is precious little blog-sized content that goes deep is that it’s hard to do. Most subjects, in or out of tech, can go incredibly deep on many, many points.

          The more detail that’s included, the harder it becomes to write less. If you ask me to write 500 words on indexing strategies for an RDBMS, I’ll probably manage, but very little will be explained. If you ask me to write 5000 words, there’s no way I’ll hit that target, because I’ll want to explain every tangent, every fundamental, and so on.

          The single best blog [series] that manages to pull this off, IMO, is Jeremy Cole’s series on InnoDB [0]. It’s not short taken as a whole, but each article is short enough that it (probably) meets people’s attention spans. I think the thing that makes it so great to me is that it’s written assuming that the reader already understands certain concepts. For example, he doesn’t spend multiple paragraphs explaining what a B+tree is; he links to the Wikipedia page, and then describes how InnoDB uses them.

          In contrast, when I write something, I usually find myself assuming that the reader doesn’t have some fundamental knowledge. I usually try to either put it in a section that’s easily skipped, or in a pop-out toggle, but either way, it’s there.

          To me, these fundamentals are incredibly important, because a) they're how I understand things, and b) I don't feel that you can operate something to its fullest capability (let alone fix it) if you don't know how the thing works.

          Re: gatekeeping, to me the perfect example is the Linux kernel. Torvalds is notoriously unfriendly, and while you can judge that however you’d like, the man has a massive weight on his shoulders, and letting code in that doesn’t meet his standards threatens that weight. I get it. Then, I also came of age during the RTFM era, so I’m used to it.

          [0]: https://blog.jcole.us/innodb/

        • 0xbadcafebee 16 hours ago ago

          I remember being taught how to change the brakes on my car. Was explained very simply, it was easy (or so I thought). So I decided to do it to my car next time I needed brake work. And I did, and a year or two later, my brakes failed.

          The thing I was taught wasn't wrong, but it left out important details. There are very specific steps involved in cleaning parts, applying the right lubricant, not applying lubricant, aligning parts, not forcing things, doing the same change on both sides, torque specs, bedding steps. If you don't do them, the brakes just fail again. The person who taught me didn't go over those important yet intricate details - probably because they were taught by a non-expert too.

          If the choice is "quit my job to start writing lots of expert content", or "accept the status quo", I choose neither. I have my own life to live. But if during the course of living my life, a wave of inaccuracy begins to lap at my door, I will attempt to stem the tide, in my own way. The toxic aspect, I think, is really just the way in which these corrections are given, and definitely I and others could do better with our tone. But to just give up and not give the corrections at all, I think would be disastrous.

          (fwiw, I think HN's overzealous "guidelines" force people to either be toxically positive or insidiously critical. that's not to say I'm not an asshole, but it's hard to just be normal on this forum without this bizarre culture coming down on you)

        • the_af a day ago ago

          > Ask yourself why. The usual answer is that top experts either can't be bothered to create better content, or they actively gatekeep, believing that their field must remain hard to learn and the riff-raff must be kept out.

          This oversimplifies things.

          It's sometimes a third option: the topic is complex enough it cannot be digested into a ready to consume blogpost without previous work (reading and practicing), which in turn requires previous work, and so on, until you've turned it into an introductory course.

          And that's not gatekeeping or "cannot be bothered to simplify" -- it's a falsehood, a kind of internet new age mantra, that everything can be made simple enough to explain in a blog post. Some things can't. There's nothing elitist about it, since everyone has the mind power to take that introductory course.

          > It's also good to ponder if the details we get worked up about actually matter. Outside the academia, approximately no one needs a precise, CS-theoretical definition of big-O notation. Practitioners use it in a looser sense.

          The article made some genuinely serious mistakes and the author here (graciously, it has to be said) admitted to being wrong about some things like Big O being "the worst case".

          In these cases, maybe the damage could be limited by saying "I know this isn't Big O; this is what some engineers call it, but it's actually something different. Because practitioners find it useful nonetheless, I will explain it (explanation follows)".

          I found the visual presentation top-notch, by the way; it's clear some effort was put into this and it shows!

          • branko_d a day ago ago

            There is a lot of value in explaining the gist of something, even if it's not entirely accurate. In my experience, the cases where even that is impossible are very rare, at least when it comes to practical computer programming.

        • bonoboTP a day ago ago

          Ideally it would be both.

          It's a bit like the choice between two doctors. One is very polite, spends an hour by your bedside every day, smiles and holds your hand and encourages you not to give up but has no idea about how to interpret your symptoms, has a horribly confused idea of what's going on but can explain that mess in his head to you in a reassuring tone that feels authoritative.

          The other doc is morose, doesn't smile, and the meetings are brief like an interrogation, but he knows exactly what's up: he spends the time reading the literature and case studies, phones up other doctors who treated similar illnesses, cuts you open and sews you back up at the right spot, and hands you the exact pills that will get you on your feet in a few months, but at the same time he was distant and cold throughout the whole thing and talked to you as if you were a car in the repair shop.

          Ideally, it would be both. Of the two I'd still choose the second guy.

      • wy1981 a day ago ago

        Writing is one of the best ways to learn something. Maybe non-experts learn something by writing about it?

        I don't think the entire internet is repeating inaccuracies. :) I also believe there are readers who attempt to learn further than a blog post. A blog post can inspire you to learn more about a topic, speaking from personal experience.

        If there were no blog posts, I believe there might be no HN at all.

        There should be a place for non-experts. One can remain skeptical when reading blog posts without hating blog posts about complex topics written by non-experts.

      • bonoboTP a day ago ago

        You can't satisfy everyone.

        My experience is the opposite. I hate the colorful books with many little boxes, pictures with captions in floaters, several different font sizes on the page, cute mascots, etc., where even the order of reading is unclear.

        Instead I found it much easier to learn from old books made before the 70s-80s, sometimes back into the 40s. It's single column, black on white, but the text is written by a real character and is far from dry. I had such a book on probability and it had a chapter on the philosophical interpretations of probability, which took the reader seriously, and not by heaping dry definitions but with an opinionated throughline through the history of it. I like that much better than the mealy mouthed, committee-written monstrosity that masks any passion for the subject. I'd rather take a half page definition or precise statement of a theorem where I have to check every word but I can then trust every word, over vague handwavy explanations. Often these modern books entirely withhold the formal definitions so you're left in a vague uneasy feeling where you kind of get it, have questions but can't find out "is it now this way, or that way, precisely?". And I'm not so sure that these irrelevant-cute-pics-everywhere books are really better for ADHD at the end of the day, as opposed to distraction free black on white paragraphs written by a single author with something to say.

      • jpfromlondon 20 hours ago ago

        It's an exercise most laymen undertake to reinforce newly acquired or partially acquired knowledge, and a valuable one.

        Where it falls short is the usefulness to others.

      • croes a day ago ago

        Most of the time inaccurate knowledge is better than no knowledge.

        I bet you also have subjects where you use and spread inaccuracies, unless you are a universal expert.

        • sgarland 20 hours ago ago

          This varies wildly with the subject, and the degree of inaccuracy.

          If you say, “hash maps are generally O(1), and are a great default choice,” you’re not that wrong (subjectivity aside).

          If you say, “storing integers as strings guarantees precision, so you should prefer them over decimal or float types,” you’re technically correct about the first part, but are enormously wrong for a variety of reasons about the conclusion.

      • IOT_Apprentice 12 hours ago ago

        Perhaps you could take the content and reformat it in a way that is better? I’d be interested in seeing your results. Thanks.

      • AdieuToLogic a day ago ago

        > Toxic expert here!

        I hate that this declaration is both hilarious and one many might suggest I need to prefix with regularly.

        :-D

    • the_af 2 days ago ago

      I think the lesson of those articles is not that people should stop trying to correct a misleading or incorrect explanation, but rather, that some people on the internet (like the "expert" described there) are more interested in picking and winning "fights" rather than gently helping the author correct his article. If you see Pyon's comments, he was very aggressive and very internet-troll-like.

      The wrong take from this would be "... and therefore, technical details don't matter and it's ok to be inaccurate."

    • the_af 20 hours ago ago

      One caveat here is that the author of the article posted it here in HN for comments -- it's not that someone else did, and this is unfair because HN was never supposed to take a look, etc. They expected a review, otherwise they wouldn't have posted it here.

      HN is not an audience of laypeople (mostly) and will critique the article with a different mindset than a novice that might be impressed by the visuals (which are impressive).

      So I think the reaction is both to be expected and reasonable: HN will critique how correct the explanation is, and point out the mistakes. And there were a couple of fundamental mistakes due to the author not being a subject matter expert.

  • Beestie a day ago ago

    Late to the party but as a layperson who knows just enough about programming to be dangerous, I found the article to be enlightening.

    Never assumed that reading it would turn me into an expert at it. Never heard of big O till now but find it a fascinating way to look at algorithm construction.

    One more thing: if every article about everything has to be a flawless exhaustive explanation then I suppose only 13 people would even know about relativity. There is plenty of room for content that simply introduces an idea and leaves filling in the gaps to the reader's discretion.

    • the_af 20 hours ago ago

      I think the criticism here is not that the article is incomplete ("not exhaustive") but rather that it makes several fundamental mistakes and doesn't actually describe Big O. It describes a "pop culture" version of Big O, which is not Big O, but calls it Big O.

      It's not useless, and a lot of care seems to have gone into it, but ultimately it's a lay person explaining something they don't firmly grasp, and making a couple of serious mistakes in the process.

      I love the presentation itself though!

  • ryeats 2 days ago ago

    O(1) in many cases involves a hashing function, which is a non-trivial but constant cost. For smaller values of N it can be outperformed in terms of wall-clock time by O(n^2) worst-case algorithms.
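
    For a rough illustration of that trade-off (TypeScript; `scanFind` is a made-up name, and the crossover point depends entirely on the runtime, the data, and the hash function):

      // The Map pays a hash per call but is O(1) on average; the array scan
      // is O(n) but does no hashing, so for a handful of entries it can win
      // on wall-clock time.
      const pairs: [string, number][] = [["a", 1], ["b", 2], ["c", 3]];
      const table = new Map(pairs);

      function scanFind(key: string): number | undefined {
        for (const [k, v] of pairs) if (k === key) return v;
        return undefined;
      }

      console.log(table.get("c"), scanFind("c")); // 3 3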

    • namibj 21 hours ago ago

      Arguably this is just a fairly poor way of thinking for quite a few practical applications. Notably, memory access latency (/inverse throughput) is roughly between O(log(n)) [ https://news.ycombinator.com/item?id=12385472 ] and O(n^(1/2)) [ https://news.ycombinator.com/item?id=12383275 ] (the latter aka O(sqrt(n))).

      For example, in applications where the sorted form can be readily maintained, a decent B+-tree tends to massively outperform a hashmap as soon as you get the streamed/non-indexed side (of what's in a way a lookup join) to opportunistically batch items:

      as when you sort your lookups, you can use exponential forward search (compare at exponentially increasing distances from the cursor; once you're past the target, run binary search between this new upper bound and the previous probe point as the lower bound) in your index for each next key, reducing the per-lookup cost to logarithmic in the distance from the last-looked-up key (asymptotically always better than single isolated lookups; in practice it tends to cap out at 2x the cost in pathological cases if you respect page locality of B+-tree structures). If you aren't ignorant of your LRU cache set during this, you'll get by with overall significantly fewer memory accesses to e.g. fresh DRAM pages (let alone NAND pages) than with a hashmap setup.

      I've severely sped up a C++ program by replacing an `std::unordered_map` with a `std::vector` (iirc technically a pair of; for SoA purposes) by realizing that I could collect the items unordered; sort the structure; then use binary search instead of hashmap lookups. The function that I modified ran something like 3x faster as a result, and that's without anything like vectorization-friendly structures or such.
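
      (If it helps, here's a minimal TypeScript sketch of the exponential forward search idea described above; `gallopingLowerBound` is a made-up name, and this ignores all the B+-tree/page-locality details:)

        // Find the first index at or after `start` whose value is >= `key` in a
        // sorted array: gallop forward at doubling distances, then binary search
        // the bracketed range. Cost is logarithmic in the distance from the
        // previous cursor rather than in the array size.
        function gallopingLowerBound(sorted: number[], key: number, start: number): number {
          let step = 1;
          let lo = start;
          let hi = start;
          while (hi < sorted.length && sorted[hi] < key) {
            lo = hi;
            hi = Math.min(hi + step, sorted.length);
            step *= 2;
          }
          while (lo < hi) {
            const mid = (lo + hi) >> 1;
            if (sorted[mid] < key) lo = mid + 1;
            else hi = mid;
          }
          return lo;
        }

        // e.g. gallopingLowerBound([1, 3, 5, 7, 9, 11], 7, 0) === 3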

    • svara 2 days ago ago

      I mean, true obviously, but don't say that too loud lest people get the wrong ideas. For most practical purposes n^2 means the computer stops working here. Getting people to understand that is hard enough already ;)

      Besides, often you're lucky and there's a trivial perfect hash like modulo.

      • b52_ a day ago ago

        What do you mean? Modulo is not a perfect hash function... What if your hash table had size 11 and you hash two keys of 22 and 33?

        I also don't understand your first point. We can run n^2 algorithms on massive inputs given it's just a polynomial. Are you thinking of 2^n perhaps?

        • Panzer04 a day ago ago

          n^2 is probably the worst offender of these algorithms - it's fast enough to get into production and slow enough to blow up once you start using it.

          Rockstar infamously wasted five minutes loading GTAV because they had an n^2 algorithm in the startup sequence.

        • LPisGood a day ago ago

          n^2 algorithms on _massive_ inputs seem a little far-fetched, no?

          Around one to one hundred billion things start getting difficult.

          • vlovich123 a day ago ago

            The challenge with big-O is you don't know how many elements result in what kind of processing time, because you don't have a baseline of performance on 1 element. So for a quadratic algorithm, if processing 1 element takes 10 seconds, then 10 elements would take over 16 minutes.

            In practice, n^2 sees surprising slowdowns way before that: in the 10k-100k range you could be spending minutes of processing time (at 10ms per element, only ~77 elements are needed to take 1 minute).
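
            Back-of-envelope check of those numbers, assuming purely quadratic scaling, i.e. time(n) = perItem · n²:

              const quadratic = (perItemSecs: number, n: number) => perItemSecs * n * n;
              console.log(quadratic(10, 10));   // 1000 s, i.e. over 16 minutes
              console.log(quadratic(0.01, 77)); // ~59 s, so ~77 items at 10ms each already cost a minute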

      • arp242 a day ago ago

        > true obviously

        Well, it doesn't seem that obvious, at least not to everyone, because several times I've seen people rewrite things to make it O(1) when it was actually slower for the vast majority of cases (or even all cases).

        • branko_d a day ago ago

          Even if it's slower for the vast majority of cases, if there are rare cases where the computer would otherwise freeze, that's still a win.

          • arp242 a day ago ago

            Your computer will not freeze if your input is small enough no matter what big-O says. In many cases it's guaranteed to be small or small-ish.

            • branko_d 18 hours ago ago

              > In many cases it's guaranteed to be small or small-ish.

              And in many cases it's assumed to be small, but not guaranteed. That's where the trouble lies.

              A classic example is Windows using a quadratic algorithm for arranging desktop icons that caused severe performance issues when users had many icons on their desktop.

              https://github.com/microsoft/Windows-Dev-Performance/issues/...

      • whatever1 a day ago ago

        N^2 is just two nested loops. It trivially shows up everywhere you have 2D arrays.

        Do I really need to make my code spaghetti if the array sizes are small enough to not impact my speed?

        • Anon1096 a day ago ago

          Iterating over a 2D array using nested loops is an O(N) operation. N is the size of your data; how it's represented in an array is irrelevant.
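
          For instance (a trivial sketch; `sum2d` is just an illustrative name):

            // The nested loops touch each of the rows × cols elements exactly
            // once, so the work is proportional to the total element count N,
            // regardless of how the data is shaped.
            function sum2d(grid: number[][]): number {
              let total = 0;
              for (const row of grid) {
                for (const value of row) total += value;
              }
              return total;
            }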

        • lurking_swe a day ago ago

          if you can guarantee N is small, then it’s probably not an issue.

  • fracus 2 days ago ago

    Having gone through electrical engineering, I always felt that Big O notation was never properly explained, unless I happened to miss every time it was introduced. It always felt like it was just hand-waved as something we should already know. I'm curious what level of math or computer science is usually responsible for introducing it.

    • daemonologist 2 days ago ago

      My university had an "algorithm analysis" class (required for CS majors) which covered it along with various kinds of proofs. That was a junior/senior year class though; I think it was really expected that you were already somewhat familiar with the concept by the end of your first year (via osmosis I guess).

    • leni536 2 days ago ago

      A function f(x) is said to be O(g(x)) if f(x)/g(x) is bounded, that is, there is some C so that for every x, f(x)/g(x) < C.

      In computer science, f(x) is often some complexity function, like the number of some specific operation performed when running an algorithm to completion.
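
      For a concrete instance of the definition: f(n) = 3n² + 5n is O(n²), since for n ≥ 1 we have f(n)/n² = 3 + 5/n ≤ 8, so C = 8 works.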

      • blt a day ago ago

        This needs to be refined: f(x) is O(g(x)) if there exists some X >= 0 such that f(x)/g(x) is bounded for all x > X.

        Otherwise, we cannot say that 1 is O(x), for example.

      • mvdtnz a day ago ago

        What an asshole way of explaining it. For practical purposes (don't @ me if this is technically wrong, I don't give a shit) it's a description of how the cost of an algorithm grows with the size of the input. For example if the cost doubles every time the input increments then it's O(n^2). If the cost increases in step with the input size then it's O(n). Simples.

        • badosu a day ago ago

          I don't have a computer science education, but I had a math one. I found it a great explanation, but I understand why others would feel it's not.

          It definitely is not an "asshole" way of explaining it though.

        • tauroid a day ago ago

          Okay unless you're doing numerical analysis where they use it for the growth rate of the error term, which has nothing to do with cost or algorithms

        • xigoi a day ago ago

          This is not only “technically” wrong, but completely wrong. O-notation has absolutely nothing to do with algorithms.

          • writebetterc a day ago ago

            Let's not overcorrect; of course O-notation has something to do with algorithms for the working programmer.

        • lern_too_spel a day ago ago

          If the cost doubles for each increment in the input size, it's O(2^n).

          • mvdtnz a day ago ago

            Right you are, my bad.

    • anthomtb 17 hours ago ago

      Big O was introduced in my Data Structures and Algorithms (DSA) class which was nominally a second year undergrad class. In practice, most Computer Science (CS) students were far enough advanced to take DSA in their first year.

      I took DSA as a third-year Electrical and Computer Engineering student. I recall maybe 15 minutes of exposition on the topic and exam questions of a near-throwaway nature (e.g.: write this tree-parsing algorithm; 95 points for your code's correctness, 5 points for explaining your algorithm's Big O complexity). It did seem like something that CS professors and teaching assistants considered either trivial or self-explanatory.

    • ordu 19 hours ago ago

      > It always felt like it was just hand waived as something we should already know.

      I learned about Big O notation while learning calculus, along with little-o.

    • LPisGood a day ago ago

      That’s because you could define it in a lot of different ways. For example, if you define it as the number of steps it takes some Turing machine representing some algorithm to halt, then there is no such thing as a logarithmic time algorithm; O(lg n) = O(1)

      • umanwizard a day ago ago

        Why is that?

        • LPisGood a day ago ago

          Log(n) grows slower than n. That means there is some number N where the process terminates in fewer than N state transitions. But in a Turing machine, you can read at most one new cell on the tape with each state transition. Therefore the algorithm halts in constant time.

          • Droobfest a day ago ago

            What is the relationship between n and N? log(n) is still unbounded so I can’t make sense of your comment.

            • LPisGood a day ago ago

              There is no relationship between n and N, of course. N is some fixed constant.

              Log(n) is unbounded, yes, but if a Turing machine halts in sub-linear time then there is some length input for which that TM halts before it reads every cell of the input tape. Therefore, it doesn’t matter how the size of the input grows, that TM must halt in constant time. Make sense?

              • Droobfest a day ago ago

                I’m sorry but no. Maybe it’s my ignorance of TM’s, but O(log n) doesn’t read all input by definition. It doesn’t follow that it is therefore _independent_ of the input size.

                What makes a/your TM special that this is the case?

                I don’t mean to take too much of your time though, maybe I’m too dumb for this.

                Edit: I can sort of see how O(log n) is impossible or at least O(n) in a TM, but to reduce it to O(1) makes no sense to me

                • umanwizard 19 hours ago ago

                  Let me try to explain the proof in more detail. Assume the TM is sublinear. Then there exists some fixed N such that for every input of size n <= N, the number of steps the TM halts in is less than N.

                  Now consider any input I, and let I_N be the subset of I that fits in N characters (I.e. I_N is the same as I except with all characters past the Nth replaced by blanks).

                  Then by our earlier statement defining N, the TM must halt on I_N after fewer than N steps. But the TM’s behavior on I for the first N steps must be the same as its behavior on I_N, because it can’t possibly have read any different characters (since in N steps it can only have read at most the first N characters, and those two inputs are the same on that interval). Thus the machine must halt in less than N steps on I. But I was chosen arbitrarily, so we’ve shown that the machine halts in less than N steps on any input, and recall that N was a fixed constant.

                  • Droobfest 6 hours ago ago

                    This looks logically sound to me, thanks!

                • LPisGood 20 hours ago ago

                  Remember that TMs read their input left to right. Therefore if there is some fixed number N where the TM halts on all length-N inputs, then it would halt in the same number of steps if the input were length N+1, because the TM halts before it can read the additional input.

    • bongodongobob 2 days ago ago

      Comp sci 101. Learned it my first year. There really isn't anything to it. It just describes how the number of operations grows as the number of inputs to your algorithm increases. It looks scary, but it's very simple and straightforward.

      • nwatson a day ago ago

        It can get a lot more complicated though. At times there are more parameters to an algorithm's complexity than just `n`. E.g., the parameters for big-O might be `n`, `K`, and `ρ`, and some expressions in the big-O might involve combinations of those parameters.

        In such cases as well, one might know for a specific application of the algorithm that, for example, `ρ` is bounded, and so the big-O becomes more influenced by the `n` and `K`.

        • bongodongobob a day ago ago

          Yeah that's fair. Calc 1 should be enough to understand that.

      • Sankozi a day ago ago

        You are commenting under a blog post that tried to explain it and got it wrong. It is not simple.

        Examples:

        - an algorithm with O(2^n) complexity might be faster for your problem than an O(1) algorithm

        - the same algorithm may be O(2^n) or O(n³) depending on how you define the size function

        This is not straightforward.

        • fxwin a day ago ago

          that's why he said it describes how runtime behaves as input size changes, not how it behaves for specific values

        • bongodongobob a day ago ago

          It is though.

    • jayd16 a day ago ago

      The Discrete Math class in the CS track is where I got the bulk of it.

  • ChicagoDave 20 hours ago ago

    I liked the presentation (right or wrong), but wanted to point out I’ve been in IT as a coder and architect for 40 years and never once needed to know big O notation.

    Which I think is the complaint here. The article author might have made an effort to connect its significance to Internet solutions like Search.

    I assume he didn’t because that’s already been done.

    • mock-possum 18 hours ago ago

      fwiw I’ve been writing code for 2 years and I have occasionally benefitted from half-remembering the concepts that big O notation describes (even though my memory of the proper terminology is pretty shoddy, it’s been a long time since we covered it in college)

      When new releases of website features have resulted in notably poor performance, and profiling has revealed a particular function taking time on the order of seconds to complete (seconds are an eternity in UI land) it’s important to be able to reach into code that processes big chunks of data, and identify loops-within-loops-within-loops that could be rearchitected to play out more efficiently.

      Knowing that you’re going from O(n^2) to O(2n) is really just a matter of jargon - knowing that your list of results now displays in 1/10th the time, and why, is still pretty important. The underlying logic/math is the important bit to have a concept of, imo, big O is just a fun way to talking about it.
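
      A typical shape of that kind of fix (an illustrative sketch, not from the article; `slowIntersect`/`fastIntersect` are made-up names):

        // Before: for each item, scan the whole other list -> O(n·m)
        function slowIntersect(a: string[], b: string[]): string[] {
          return a.filter((x) => b.includes(x));
        }

        // After: build a Set once, then each membership check is O(1) on
        // average -> O(n + m)
        function fastIntersect(a: string[], b: string[]): string[] {
          const seen = new Set(b);
          return a.filter((x) => seen.has(x));
        }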

    • echelon 20 hours ago ago

      Every engineer should be familiar with at least the basics of computational complexity.

      If you're stacking loops, you ought to know what that does.

      You also really ought to know what your core data structures are, how they work under the hood, and how the different operations on them consume space and time.

      If you don't familiarize yourself with at least those bare basics, you're doing yourself and your customers (and your future maintainers) a disservice.

      • ChicagoDave 16 hours ago ago

        I’ve always done load tests, cured bad code, re-tested. Static analysis can discover most of these kinds of problems. It’s entirely possible I’ve been resolving big O problems intuitively over the years.

  • dekhn a day ago ago

    What blows me away is that there are quantum calculations which grow as O(N^7) with respect to the number of atoms, and scientists are fine with running them because N is small, computers keep getting faster and gaining more memory, and the results are valuable.

    (I'm not a computer science expert, so if I used the O(n) terminology differently from the standard, sorry).

    • JohnKemeny a day ago ago

      You can simply say grows as n⁷. Although people will understand what you mean by saying O(n⁷), O is an upper bound, not a lower bound.

      What you should say if you want to be correct is it grows as Ω(n⁷).

    • wasabi991011 a day ago ago

      "Fine with running them" is a bit of a stretch, though the gist is correct.

      There's a whole hierarchy of quantum chemistry algorithms where accuracy is traded with asymptotic runtime, I think starting at N^3 and going all the way to N!.

  • totisjosema 2 days ago ago

    Love the visualizations! For someone that learned these algorithms years ago, it is still nice to see them visualized!

  • 1zael 2 days ago ago

    The dynamic visualizations are a massive help in understanding the content. Please make more lessons like this!

    • samwho 2 days ago ago

      I’m glad you like them! <3

      • primitivesuave a day ago ago

        I just want to add to the chorus that your visualizations are outstanding.

        • herodoturtle 3 hours ago ago

          Came here to say the same.

          Don't want to start a separate comment thread repeating what's been said before, but I felt compelled to contribute to the crescendo of this one instead.

          These visualisations are simply great.

          Programmer of 30 years here. Studied Big O formally around 20 years ago. Loved the refresher. But beyond all that, I found the in-line animation examples were just superbly executed.

          I would have absolutely *loved* this style of teaching in my undergraduate algorithms class. It is tangible and practical - not to mention fun.

          Kudos to the author.

      • ygouzerh a day ago ago

        So impressive!

  • nilsherzig a day ago ago

    Loving the interactive visualizations. Reminded me of https://kelseyc18.github.io/kademlia_vis

    Can someone recommend a collection of similar pages?

  • bubblyworld a day ago ago

    If you work with asymptotic notations enough you realise they can often (loosely) be used like numbers - they obey certain arithmetic laws, and it's not uncommon for analysts to write stuff like 2^o(1), for instance. Take that rabbit hole further and you get the hyperreals! A number system that at once generalises real numbers, asymptotic notation and infinitesimal/infinite quantities =D

    https://en.m.wikipedia.org/wiki/Hyperreal_number

    (for anyone who cares, the ultrafilter construction kinda bakes in the laws of asymptotics into its definition of equality, if you squint)

  • MDGeist 2 days ago ago

    Every time there is a thread on big O notation I'm hoping someone is going to explain something that will make it clear how it relates to the anime The Big O. I'm still trying to understand what happened in that show.

    • RajT88 2 days ago ago

      (Chugs 4 beers)

      OK, so. So, it's like you mixed together Pacific Rim, Dark City and The Matrix all into the same show. In that order.

    • scuff3d a day ago ago

      All I ever remember about that show is it's kinda Batman with a giant robot, and the end gets real weird.

  • cat-whisperer 2 days ago ago

    I've found that one of the most effective ways to internalize big O notation is to relate it to everyday experiences.

    • rune-space a day ago ago

      Take a few runs through sorting a growing deck of cards and anyone will be able to tell you why bubble sort sucks.
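
      Counting the comparisons makes the growth obvious (a minimal sketch; `bubbleSortComparisons` is a made-up helper):

        // Bubble sort with a comparison counter: double the deck size and the
        // number of comparisons roughly quadruples (n(n-1)/2 of them).
        function bubbleSortComparisons(deck: number[]): number {
          const a = [...deck];
          let comparisons = 0;
          for (let i = 0; i < a.length; i++) {
            for (let j = 0; j < a.length - i - 1; j++) {
              comparisons++;
              if (a[j] > a[j + 1]) [a[j], a[j + 1]] = [a[j + 1], a[j]];
            }
          }
          return comparisons;
        }

        // 10 cards -> 45 comparisons, 20 -> 190, 40 -> 780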

  • csk111165 a day ago ago

    Awesome! How did you make those interactive code executions in that box?

  • lenkite 2 days ago ago

    These large numbers in JS are bigger than Number.MAX_SAFE_INTEGER. They all lose considerable precision.

    • samwho 2 days ago ago

      They do! Part of me wanted to go back and change the sum function to something else but in the end I figured it’s not important.

  • dawnofdusk 2 days ago ago

    Whenever I read content like this about Big O notation I can't help but think the real solution is that computer science education should take calculus more seriously, and students/learners should not dismiss calculus as "useless" in favor of discrete math or other things that are more obviously CS related. For example, the word "asymptotic" is not used at all in this blog post. I have always thought that education, as opposed to mere communication, is not about avoiding jargon but explaining it.

    • crystal_revenge 2 days ago ago

      > students/learners should not dismiss calculus as "useless"

      This seems to be quite a bit of a strawman to me.

      ML is such a major part of the field today, and it requires at a minimum a fairly strong foundation in calc, linear algebra, and probability theory, so I earnestly don't believe there are that many CS students who view calculus as "useless". I mean, anyone who has written some Pytorch has used calculus in a practical setting.

      Now in pure software engineering you will find a lot of people who don't know calculus, but even then I'm not sure any would decry it as useless, they would just admit they're scared to learn it.

      If anything, I'm a bit more horrified at how rapidly people's understanding of things like the Theory of Computation seems to be vanishing. I've been shocked by how many CS grads I've talked to who don't really understand the relationship between regular languages and context-free grammars, don't understand what 'non-determinism' means in the context of computability, etc.

      • dawnofdusk 2 days ago ago

        >This seems to be quite a bit of a strawman to me.

        Not really. If you ever listen to CS undergrads or people in non-traditional schooling (bootcamps, etc.) talk about software engineering, this opinion is essentially ubiquitous. People interested in ML are less likely to hold this exact opinion, but they will hold qualitatively identical ones ("do you really need multivariable calculus/linear algebra to do ML?"). It is precisely because people (primarily Americans) are scared to learn mathematics that they rationalize away this fear by saying the necessary mathematics must not be essential, and indeed it is true that many people get away without knowing it.

        • the_af 2 days ago ago

          > non-traditional schooling (bootcamps, etc.)

          It's probably unfair to qualify those as "computer science education" though...

      • marcosdumay 2 days ago ago

        Most people from almost every specialty mostly dismiss calculus as useless, don't even remember linear algebra is a thing, and either never learn statistics or forget everything about it. (But the reaction to probability varies a lot.)

        I have no idea what's up with those disciplines, but it's an almost universal reaction to them. Unless people are very clearly and directly using them all the time, they just get dismissed.

        • samwho 2 days ago ago

          Can’t imagine what it might be.

          • marcosdumay 17 hours ago ago

            If that's sarcastic, do you have any idea why probability is different?

    • samwho 2 days ago ago

      Part of the problem is that a lot of people that come across big O notation have no need, interest, or time to learn calculus. I think it's reasonable for that to be the case, too.

      • ndriscoll 2 days ago ago

        The thing is, this is like saying lots of mechanical engineers have no need, interest, or time to learn derivatives; they just want to get on with "forces" and "momentum" and cool stuff like "resonance". Saying you have no interest in learning limits and asymptotes but you want to know what people are talking about when they mention asymptotic analysis doesn't make sense.

        If you want to know what calculus-y words mean, you're going to need to learn calculus. People use calculus-y words to quickly convey things professionally. That's why it's a "topic" for you to learn. The thing under discussion is a limit.

        • samwho 2 days ago ago

          I replied to this effect to someone else in this thread, but I think it's reasonable for people to want to have an idea of what big O is for (in software engineering!) without having a grounding in calculus. The notation is useful, and regularly used, without it.

          • dawnofdusk 2 days ago ago

            It's reasonable, but essentially every "common misconception about Big O" exists because people didn't have the necessary notions from calculus. For example, the fact that O(x^2) can be practically faster than O(x), due to the size of constants/subdominant terms, is confusing only if you never properly learned what asymptotic behavior is.

            The practical question is whether you think it's ok to continue propagating a rather crude and misunderstanding-prone idea about Big O. My stance is that we shouldn't: engineers are not business people or clients; they should understand what's happening, not rely on misleading cartoon pictures of what's happening. I do not think you need a full-year collegiate course in calculus to get this understanding, but certainly you cannot get it if you fully obscure the calculus behind the idea (as this and countless other blogpost explainers do).

          • ndriscoll 2 days ago ago

            Given the various ways people in this thread have pointed out you lack fluency with the notation, why do you think it reasonable for people to want to learn it without learning the concepts it's describing?

            • samwho 2 days ago ago

              I’m not sure that’s quite my position. Happy to cede that I lack fluency, and I appreciate your time and the time others have given to help me understand.

      • oooyay 2 days ago ago

        I find myself in the odd position of disagreeing with you and the person you responded to.

        First, software engineering doesn't just consist of Computer Science majors. We have a lot of people from accounting, physics, or people who have no degree at all. Teaching this concept in CS fixes very little.

        Second, and complementary to the first, is that asymptotic behavior is derivative of the lessons you learn in Calculus. You can't really fully understand it beyond a facade unless you have a rudimentary understanding of Calculus. If you want to put this theory to the test then ask someone with a functional understanding of Big-O to write asymptotic notation for a moderately complex function.

        I don't have a degree and in order to really understand asymptotics (and Big-O as well as the others) I read a primer on Calculus. It doesn't take a ton of knowledge or reading but a decent background is what will get you there. I do think we need a lot better continuing education in software that goes beyond O'Reilly style technical books that could fill this gap.

      • growthwtf 2 days ago ago

        I'm not the original commenter, but that makes a lot of sense! I had assumed there was a huge overlap, personally.

        • samwho 2 days ago ago

          I think it's pretty common for folks to enter the software field without a CS degree, start building apps, and see big O notation without understanding what it is. These people have jobs, deadlines, they want to ship features that make peoples' lives easier. I'd bet many of those people don't care so much about calculus, but a quick intro to what all this big O nonsense is about could help them.

    • lenkite 2 days ago ago

      Apparently Big-O notation was invented by Paul Bachmann in 1894. It's not a johnny-come-lately.

  • tveyben 2 days ago ago

    Beautiful - I sent a ping and hope it was delivered and have a small dopamine kick ;-)

    • samwho 2 days ago ago

      It was, thank you so much <3

  • clbrmbr 20 hours ago ago

    I would have liked to see n log n

    • JohnKemeny 20 hours ago ago

      n log n is a lot like n, in that if n is less than a trillion, n log n is at most 40·n.
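
      (Sanity check of that constant:)

        console.log(Math.log2(1e12)); // ≈ 39.86, so n·log2(n) ≤ ~40·n for n up to a trillion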

  • swimwiththebeat 2 days ago ago

    Your blog always has such informative visualizations that make these concepts easy to digest! Which tools do you use to create them?

  • IceDane 2 days ago ago

    Every time I see an introduction to Big-O notation, I immediately go look for a section about the worst case to find the most common, fundamental misconception that is present in nearly every such text.

    Lo and behold:

    > Because it's common for an algorithm's performance to depend not just on the size of the input, but also its arrangement, big O notation always describes the worst-case scenario.

    This is not true. I'm on my phone, so I can't be bothered to explain this in more detail, but there is plenty of information about this online.

    • tromp 2 days ago ago

      O() doesn't necessarily describe worst-case behaviour. It just provides an asymptotic upper bound. So a quadratic sorting algorithm can still be said to be O(n^3), although that might be a little misleading.

      • didibus a day ago ago

        In math that's true, but a practicing software engineer uses it to mean the common "tightest" upper bound we can easily prove for the case we care about (worst, amortized, average, etc.), and not the actual tightest possible bound.

        • monkeyelite a day ago ago

          It has a meaning already - and that is the set of functions which satisfy the condition. So O(n^2) is a subset of O(n^3).

          • didibus 16 hours ago ago

            And to add to my other comment, I don't think it came to mean that out of stupidity either.

            As far as I know, and correct me if I'm wrong, there isn't a mathematical term for an approximate tight upper bound that's based on a vague criterion of "fast enough".

            In practice, people needed a way to say something like, let's not waste time getting all precise here, what's the approximate close-ish tight bound that's obvious, alright, that will work, let's use it.

            Big Theta implies proving the lower bound, and generally specifying the exact upper bound. So it was a worse term to use to describe the above. I think people settled on Big O because you end up with some upper bound, and not necessarily the tightest one, but you land on a tight-enough upper bound that it serves your practical purpose of choosing an algorithm to use in your application.

            And so it became BigO, all for a lack of a better word, and eventually that became common parlance in those circles.

            • monkeyelite 14 hours ago ago

              > there isn't a mathematical term for an approximate tight upper bound that's based off vague criteria of "fast enough"

              In mathematical writing, people come up with definitions to precisely describe this kind of thing - that's what big oh/theta aim to do: formalize what we mean by "bounded by in long-term growth".

          • didibus 16 hours ago ago

            > It has a meaning already

            Natural language tends to work like that: many words take on different meanings in different contexts, and the dictionary often needs to add alternate definitions as language evolves.

            If you come to a software engineering interview and take the mathematical meaning and answer O(n^3), you won't get the job. You need to know the lingua franca of the trade, not just the textbooks.

            I know this can be frustrating, I just wouldn't really recommend staking your ground on this mountain.

            The other mistake is to assume the interviewer doesn't know better and you do, and to try to explain their mistake. A lot of them know very well; you simply didn't correctly understand the ask because you got confused by the context of the problem statement, and it shows you didn't adapt to the working environment successfully.

            And if you think that's bad, wait till you talk to product ;)

            • monkeyelite 14 hours ago ago

              You’re describing a situation of poor communication - not a fault for using the precise definition of Big oh.

          • arethuza a day ago ago

            And everything is O(BB(n)).

      • samwho 2 days ago ago

        I'd love to hear more about how a quadratic sorting algorithm could be said to be O(n^3). That isn't intuitive to me.

        • mfsch 2 days ago ago

          Technically the big O notation denotes an upper bound, i.e. it doesn’t mean “grows as fast as” but “grows at most as fast as”. This means that any algorithm that’s O(n²) is also O(n³) and O(n⁴) etc., but we usually try to give the smallest power since that’s the most useful information. The letter used for “grows as fast as” is big Theta: https://en.wikipedia.org/wiki/Big_O_notation#Use_in_computer...

          • samwho 2 days ago ago

            Ahh I see, thank you!

        • ndriscoll 2 days ago ago

          Because |n^2|≤|n^3| as n→∞, so if |f| ≤ A|n^2| as n→∞, then |f| ≤ A|n^3| as n→∞.

    • bonoboTP 2 days ago ago

      Yes. The notation can be used to describe how fast any function grows. That function may be the worst-case runtime wrt. input length, the average-case runtime wrt. input length, or anything else, really.

      But the most common usage is indeed worst-case analysis, especially in intro courses.

      This is also wrong in the article:

      > You may sometimes see the Greek letter "theta", Θ, instead of the letter O in big O notation. This is used when the best and worst case have the same order of growth. We can't use it for bubble sort because the best case is O(n) and the worst case is O(n^2).

      It conflates two different, orthogonal concepts: upper vs lower bounds on the one hand, and best vs worst case analysis on the other.

      The phrase "the best case is O(n)" already contradicts "big O notation always describes the worst-case scenario". They clearly use it for best case right there in the article.

      • ayhanfuat 2 days ago ago

        > But the most common usage is indeed worst-case analysis, especially in intro courses.

        It is also the most sensible one. The best-case version is basically useless: if you are lucky you may end up with O(1), but that doesn't tell you anything. The average case could be good, but it requires you to estimate the distribution.

        • namibj 20 hours ago ago

          Or, instead of "estimate the distribution" look at what conceptual properties of the instance the big-O cares about.

          And then derive big-O in terms of such properties.

          See for example Tetris-LoadBalanced and Tetris-Reordered (on top of Tetris-LoadBalanced) for such works: https://arxiv.org/abs/1909.12102

        • tialaramex a day ago ago

          You'd think, but one of the world's most popular C++ stdlib implementations (Clang's libcpp) shipped actual Quicksort, which has an O(N squared) worst case, as its default sort until just a few years ago. The typical case for Quicksort is fast, so most C++ programmers didn't notice.

          And I really do mean just a few years. Like, Joe Biden was President.

      • samwho 2 days ago ago

        I’d love to hear how those two things differ, and how I could include it in the post in a way that won’t be too scary for a beginner to the topic.

        That section in particular has already seen several revisions based on other peoples’ comments.

        • bonoboTP 2 days ago ago

          Basically, for each input length, you can have different runtimes depending on the contents of that input. If you always pick the highest runtime for each length, that gives you a specific function. You can then derive either a lower or an upper bound for that function. Same for the best case.

          You can state four combinations in their common sense intuition as

          - even in the worst case, the runtime grows only at most as fast as n^2 (modulo multiplication and addition)

          - in the worst case, the runtime can grow at least as fast as n^2

          - in the best case, the runtime can grow only at most as fast as n^2

          - even in the best case, the runtime grows at least as fast as n^2

          One can imagine it as not just a single curve on a function plot, but a band, a range. Then the top line of this range can be bounded from above or below, and the bottom can also be bounded from above or below. The common case is when you give a bound for the top curve, from above.

          How to work this into the article, I don't know, sorry.

          • samwho 2 days ago ago

            I appreciate you taking the time to explain this to me, thank you.

            I feel like the way I have it is a reasonable, if strictly incorrect, simplification for someone new to the material.

            When I write these, I have the newly-hired bootcamp grad as my target audience. Someone who has been taught the basics for how to write code for the web and work as a junior within a company, but has no computer science background. I want them to understand enough to make sense of what they see.

            • bonoboTP 2 days ago ago

              Maybe like this:

              "It's common for an algorithm's performance to depend not just on the size of the input, but also its arrangement. In such cases we typically use big O notation to describe the worst-case scenario. However, note that we may perform similar analysis on the best case or the average case as well."

              But here:

              > You may sometimes see the Greek letter "theta", Θ, instead of the letter O in big O notation. This is used when the best and worst case have the same order of growth. We can't use it for bubble sort because the best case is O(n) and the worst case is O(n^2).

              This is plain false, the worst-case runtime of bubble sort is indeed Θ(n^2). Either delete it, or write about the upper bound / lower bound thing.

              • samwho 2 days ago ago

                I made some wording changes based on what you wrote, I'd appreciate your feedback.

                I'm struggling to understand how the part about big theta is false. Is the best case for bubble sort not O(n)? I've deleted the section for the time being.

                • umanwizard 2 days ago ago

                  The difference between big theta and big O has nothing to do with worst vs. average case. They are two completely orthogonal axes.

                  You can talk about the big-O of the average case, the big-theta of the average case, the big-O of the worst case, or the big-theta of the worst case.

    • lordleft 2 days ago ago

      Are you referring to the fact that Big O has notation and concepts revolving around best-case/average case scenarios, as well as worst-case scenarios?

      • umanwizard 2 days ago ago

        It is the same notation in either case. The worst-case runtime of quicksort is O(n^2). The average runtime is O(n * log(n)).

    • samwho 2 days ago ago

      It’s okay, you’re not the first to point it out to me!

      I’d be interested to hear how you would explain it when you’re at a more appropriate device.

      • umanwizard 2 days ago ago

        Big O notation can be used to describe any function, whether worst case runtime of an algorithm, average case runtime of an algorithm, the price of eggs in China as a function of time, or anything else.

      • ndriscoll 2 days ago ago

        In order to write "f(n) is O(g(n))", you already had to smuggle in the idea that you were talking about the worst case before even talking about O(*). What is f? Typically it is the max of "steps taken" over all input problems of size n, i.e. the worst case.

        The O(g(n)) part says f is asymptotically bounded by g(n) up to some constant factor.

        • samwho 2 days ago ago

          This is how you would explain it?

          • ndriscoll 2 days ago ago

            If you're just trying to explain what f~O(g) means, then it makes sense to talk in terms of functions and asymptotes like you would in an intro to calculus, sure. Then, separately, computer scientists are interested in f(n) = max_{|S|=n} steps_required(program, S) where S is some input/problem type with some notion of size.

            The fact that f looks innocuous but is secretly actually this complicated thing is perhaps the stumbling block for people. The actual O(*) part is just "line stays at or below other line once you go far enough out" (with a slight tweak to allow you to pre-stretch the bounding line).
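
            Spelled out the standard way, that informal sentence is exactly the textbook definition:

              f(n) = O(g(n))  means  there exist c > 0 and n0 such that f(n) ≤ c·g(n) for all n ≥ n0

            "far enough out" is the threshold n0; "pre-stretch the bounding line" is the constant factor c.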

            • samwho 2 days ago ago

              I'm aiming this at junior software engineers without math or computer science backgrounds. My own computer science background is very old at this point, and I never did have the math. So I'm not surprised I've made mistakes that are being pointed out by folks that live and breathe this stuff. I want to impart enough useful (if not strictly correct) information to folks to make a difference in their day jobs.

      • mminer237 2 days ago ago

        I would just say "almost always" to cut off the pedants. You do use the same notation to describe the best-case and average-case as well; people just don't care about those as often.

        • jibal a day ago ago

          Pejoratively referring to people as "pedants" simply because they know what they are talking about is quite the ad hominem fallacy.

          As for what people care about, it depends on what they're doing and what they know about the input distribution. If you're concerned about DOS attacks or response time in a game, then the worst case is of primary importance. But if you're paying for wall time, then you measure a hash table lookup by its O(1) average time, not its O(n) worst-case time.

        • samwho 2 days ago ago

          I did push up a rewording that hopefully lands better. Fingers crossed.

      • IceDane 2 days ago ago

        TLDR: Different contexts, different analyses. You can do a best-case, average-case and worst-case analysis. A linear search is O(1) in the best case, for example.
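
        For example, in Python (the function name is just for illustration):

          def linear_search(items, target):
              # Return the index of the first match, or -1 if target is absent.
              for i, item in enumerate(items):
                  if item == target:
                      return i
              return -1

          # Best case:  target is items[0]      -> 1 comparison,  O(1)
          # Worst case: target is not in items  -> n comparisons, O(n)
          # Same code, same notation; only the case being analysed changes.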

        Maybe this comes across as nitpicky to some, but that doesn't make the text any less incorrect, unfortunately. It's an extremely common misconception, however, and I would say that a huge proportion of students that go through an introductory course on this subject come out thinking the same thing.

        > It’s okay, you’re not the first to point it out to me!

        It would be nice if you corrected it - there are already too many people making this mistake online.

        • samwho 2 days ago ago

          So your contention is with the word “always”? It doesn’t always mean worst case? I got told off by someone else for _not_ saying this.

          I really just want to find the way of describing this that won’t net me comments like yours. It is very disheartening to spend so much time on something, and really try to do the topic justice, to be met with a torrent of “this is wrong, that’s wrong, this is also wrong.” Please remember I am a human being trying to do my best.

          • blt 2 days ago ago

            The definition of big O notation is pure math - there is nothing specific to analysis of algorithms.

            For example: "the function x^2 is O(x^3)" is a valid sentence in big-O notation, and is true.

            Big O is commonly used in other places besides analysis of algorithms, such as when truncating the higher-order terms in a Taylor series approximation.

            Another example is in statistics and learning theory, where we see claims like "if we fit the model with N samples from the population, then the expected error is O(1/sqrt(N))." Notice the word expected - this is an average-case, not worst-case, analysis.
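
            Two concrete instances of those non-algorithmic uses:

              x^2 = O(x^3) as x → ∞, since x^2 ≤ 1·x^3 whenever x ≥ 1.

              e^x = 1 + x + x^2/2 + O(x^3) as x → 0, i.e. the error from truncating the
              Taylor series there is bounded by a constant times |x|^3 near zero.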

            • samwho 2 days ago ago

              I should have probably been a lot more clear about who the target audience of my post was.

          • bonoboTP 2 days ago ago

            > It is very disheartening to spend so much time on something, and really try to do the topic justice, to be met with a torrent of “this is wrong, that’s wrong, this is also wrong.” Please remember I am a human being trying to do my best.

            HN (and forums in general) can be blunt in this way (cf. Cunningham's Law). When you put something up, most of the reaction will be about how it's wrong. Things being correct is just the default; people won't comment on how correct something is. Try not to take it personally and treat it more like a helpful piece of info. Ideally you'd want all statements to be correct. Textbook authors often incentivize hardcore nitpicking by mentioning submitters in the acknowledgments, or even paying cash. Try to take it in that spirit.

            • samwho 2 days ago ago

              That framing really helps, thank you. <3

          • IceDane 2 days ago ago

            > So your contention is with the word “always”? It doesn’t always mean worst case? I got told off by someone else for _not_ saying this.

            Using "always" is definitely wrong, but that isn't really what makes this wrong. BonoboTP caught another thing that I missed which tells me that you misunderstand some of these concepts on a somewhat fundamental level. Quote:

            > You may sometimes see the Greek letter "theta", Θ, instead of the letter O in big O notation. This is used when the best and worst case have the same order of growth. We can't use it for bubble sort because the best case is O(n) and the worst case is O(n^2).

            Big O, Ω, and Θ are ways to describe asymptotic bounds on runtime. You can apply them in best, average, or worst case analyses. For example, linear search is O(1), Ω(1), and Θ(1) in the best case but it is O(n), Ω(n), and Θ(n) in the worst case.
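
            To spell out what each symbol says about a function f (in whichever case you are analysing):

              f(n) = O(g(n)):  f(n) ≤ c·g(n) for some c > 0 and all sufficiently large n   (upper bound)
              f(n) = Ω(g(n)):  f(n) ≥ c·g(n) for some c > 0 and all sufficiently large n   (lower bound)
              f(n) = Θ(g(n)):  both of the above hold                                      (tight bound)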

            > I really just want to find the way of describing this that won’t net me comments like yours. It is very disheartening to spend so much time on something, and really try to do the topic justice, to be met with a torrent of “this is wrong, that’s wrong, this is also wrong.” Please remember I am a human being trying to do my best.

            I'm sorry.. but what? You posted some learning material on the web which is wrong. I am just correcting you, and in a civil manner, mind you. I'm not exactly sure what you want me to do. Should we all just pretend that there aren't errors in what you posted so we don't hurt your feelings?

            • samwho 2 days ago ago

              No, I'm grateful for the time you're spending time helping me understand. I'm struggling to come up with wording that satisfies everyone and venting some frustration at you. I'm sorry, that wasn't fair of me.

              It's not clear to me based on what you're written what I should change. With the "always" wording changed, what else do I need to do for this to be correct? Ideally I want to avoid talking about asymptotic bounds.

          • jibal 2 days ago ago

            > I got told off by someone else for _not_ saying this.

            And what made you think they were right?

            > I really just want to find the way of describing this that won’t net me comments like yours.

            Do research. There's a large literature on algorithmic complexity ... maybe start with https://en.wikipedia.org/wiki/Big_O_notation

            > It is very disheartening to spend so much time on something, and really try to do the topic justice, to be met with a torrent of “this is wrong, that’s wrong, this is also wrong.” Please remember I am a human being trying to do my best.

            non sequitur

            • samwho 2 days ago ago

              Thank you.

  • sfn42 2 days ago ago

    This was a really well done article, great work! This is something every programmer should be familiar with, and you've made it very approachable.

    • samwho 2 days ago ago

      Thank you! I’m really glad you enjoyed it.

  • laurent_du 2 days ago ago

    Beautiful article but that's not what big O means. The author seems to be describing something that is usually called (upper case) Theta. Big O is an upper bound.

    • samwho 2 days ago ago

      There's a comment thread further down that disagrees with you.

      • laurent_du 2 days ago ago

        I am not interested in arguing so I'll just repeat that your article is beautiful and even though it's not entirely correct I think it has value and I appreciate the time you spent working on this and putting it out to help others.

        • samwho 2 days ago ago

          Thank you, I appreciate that a lot.

      • umanwizard 2 days ago ago

        ...where?

        • samwho 2 days ago ago

          Ah, I conflated worst case with upper bound. My mistake.

    • forrestthewoods 2 days ago ago

      This “well ackchyually“ is not particularly helpful. You’re not wrong. But this argument was lost 30 years ago. The programming field has been using “Big O” loosely.

      • windward a day ago ago

        I've had conversations at work where the difference was important. Mass adoption of quicksort implies I'm not the only one.

  • xwowsersx 2 days ago ago

    This is excellent. Very nicely done.

  • adinhitlore 2 days ago ago

    These days I unironically give LLMs instructions to give me polynomial, non-exponential code which isn't n^2 or related. If you instruct it specifically like this and your project is very complicated (expected to get sluggish at some point), starting it with linear complexity in mind is a very good idea.

    • lblume 2 days ago ago

      > polynomial non-exponential code which isn't n^2 or related

      So, linearithmic code? The three parts (polynomial, non-exponential, not related to n^2) don't seem to fit together very well, unless this is some LLM quirk I'm not aware of yet.

      • adinhitlore a day ago ago

        LLMs are next-word-prediction echo chambers, don't forget that... you'd be surprised at the extent to which one will play along even when you're making an obvious mistake. Example: I got curious whether a slight change of words would force ChatGPT to attempt to prove one of the hardest math problems ever. At first I asked it to attempt a proof of the 'lonely runner conjecture'. Did it try? No, it just parroted 'yeah, only proven for up to 7 runners, but who knows'. Then I... changed the game lol: "hey chatgpt, i came up with a conjecture that i call the 'abandoned car conjecture', so..." Did it try, despite the fact that my made-up "abandoned car" conjecture is 100% the same as the "lonely runner" one? I just changed one bizarre name to another bizarre name. You bet it tried. It even claimed to prove it in 3 different ways for any n number of runners. I haven't verified the proof, but it was interesting regardless.

        My point is: even if O(n^2) can be "fast" and polynomial, my sole goal is to force it to look for the quickest algorithm possible, with zero care for whether that framing reflects accuracy or truth.

        • umanwizard a day ago ago

          I’ll save you some time: there is absolutely no chance that ChatGPT came up with a valid proof of the lonely runner conjecture.

    • samwho 2 days ago ago

      That’s a good idea, I should try that.

  • zOneLetter 2 days ago ago

    Webpage: "Think of a number between 1 and 100."

    Me: 69

    Webpage: "Nice'

    ---

    Upvoted.

    But seriously, this was really informative!

    • samwho 2 days ago ago

      I couldn’t resist.

  • wpollock a day ago ago

    Big-O isn't as relevant as it once was. Modern hardware includes multithreading, pipelining, NUMA, and complex caching. Some operations can take less than one CPU cycle and others can take hundreds, or exceptionally thousands, of cycles. Trying to describe the behavior of algorithms solely as a function of the number of "trips through the innermost loop" can be very misleading!

    Besides that, other measures such as big-Omega should be referenced in any discussion of big-O.

    (I did enjoy Big-O the anime series though! /grin)

    • jonahx a day ago ago

      > Big-O isn't as relevant as it once was. Modern hardware includes multithreading, pipelining, NUMA, and complex caching. Some operations can take less than one CPU cycle and others can take hundreds, or exceptionally thousands, of cycles.

      Big-O theory was invented precisely to be a measure of computation that is independent of such low-level details. In that sense it is timeless. And any (decent) presentation always includes appropriate caveats like "the constant C may be very relevant for 'smaller' N".

      • wpollock a day ago ago

        You're right. I didn't mean to imply that Big-O is useless.

        I'm reminded of an incident in the 1990s, when I was an instructor teaching C at the local telco, and a student showed me a (COBOL) program that had stumped his programming team. It would always hang their computer when it was run. What caught my eye was that the main loop was nested seven levels deep, and each loop ran 50,000 iterations. Their computer wasn't hung; they only needed to wait a decade or so for the computation to finish!

        Big-O knowledge made the error obvious to me, and I added that to the curriculum for all programming courses I could.

        • namibj 19 hours ago ago

            (On the order of) 110 bits of brute force isn't anywhere close to "a decade", especially if you aren't throwing severe parallelism at it.

          Edit: we're talking on the order of a million times the current age of the universe if you have a multi-GHz CPU and only take one cycle for each inner iteration.

          • wpollock 13 hours ago ago

              50000^7 iterations ÷ 10^8 iterations per second ≈ 7.8E24 seconds ≈ 2.5E17 years. That's a long time!
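
              A quick sanity check of that arithmetic in Python, assuming a hypothetical 10^8 inner iterations per second:

                N = 50_000
                iterations = N ** 7                # seven nested loops of 50,000 each, ~7.8e32
                per_second = 100_000_000           # assumed throughput: 1e8 iterations/second
                seconds = iterations / per_second
                years = seconds / (60 * 60 * 24 * 365)
                print(f"{years:.2e} years")        # prints roughly 2.48e+17 years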