Everything is correlated (2014–23)

(gwern.net)

224 points | by gmays 18 hours ago ago

103 comments

  • simsla 11 hours ago ago

    This relates to one of my biggest pet peeves.

    People interpret "statistically significant" to mean "notable"/"meaningful". I detected a difference, and statistics say that it matters. That's the wrong way to think about things.

    Significance testing only tells you the probability that the measured difference is a "good measurement". With a certain degree of confidence, you can say "the difference exists as measured".

    Whether the measured difference is significant in the sense of "meaningful" is a value judgement that we / stakeholders should impose on top of that, usually based on the magnitude of the measured difference, not the statistical significance.

    It sounds obvious, but this is one of the most common fallacies I observe in industry and a lot of science.

    For example: "This intervention causes an uplift in [metric] with p<0.001. High statistical significance! The uplift: 0.000001%." Meaningful? Probably not.

    • mustaphah 7 hours ago ago

      You're spot on that significant ≠ meaningful effect. But I'd push back slightly on the example. A very low p-value doesn't always imply a meaningful effect, but it's not independent of effect size either. A p-value comes from a test statistic that's basically:

      (effect size) / (noise / sqrt(n))

      Note that bigger test statistic means smaller p-value.

      So very low p-values usually come from bigger effects or from very large sample sizes (n). That's why you can technically get p<0.001 with a microscopic effect, but only if you have astronomical sample sizes. In most empirical studies, though, p<0.001 does suggest the effect is going to be large because there are practical limits on the sample size.
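
      To put rough numbers on that formula, here is a minimal one-sample z-test sketch in Python (sigma and the effect size are made up purely for illustration):

        import numpy as np
        from scipy import stats
        sigma = 1.0
        effect = 1e-4   # a microscopic true effect, in units of sigma
        # z = effect / (sigma / sqrt(n)); the p-value shrinks only because n grows
        for n in (1e4, 1e6, 1e8, 1e10):
            z = effect / (sigma / np.sqrt(n))
            p = 2 * stats.norm.sf(z)   # two-sided p-value
            print(f"n={n:.0e}  z={z:7.2f}  p={p:.3g}")

      Only somewhere around n ≈ 10^9 does this 0.0001-sigma effect clear p < 0.001, which is exactly the "astronomical sample size" regime.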

      • specproc 6 hours ago ago

        The challenge is that datasets are just much bigger now. These tools grew up in a world where n=2000 was considered pretty solid. I do a lot of work with social science types, and that's still a decent sized survey.

        I'm regularly working with datasets in the hundreds of thousands to millions, and that's small fry compared with what's out there.

        The use of regression, for me at least, is not getting that p-gotcha for a paper, but as a posh pivot table that accounts for all the variables at once.

        • refactor_master 5 hours ago ago

          There’s a common misconception that high throughput methods = large n.

          For example, I’ve encountered the belief that merely recording something at ultra-high temporal resolution gives you “millions of datapoints”. This then (seemingly) has all sorts of effects on the downstream statistics and hypothesis testing.

          In reality, the replicability of the entire setup, the day it was performed, the person doing it, etc. means the n for the day is probably closer to 1. So to ensure replicability you’d have to at least do it on separate days, with separately prepared samples. Otherwise, how can you eliminate the chance that your ultra finicky sample just happened to vibe with that day’s temperature and humidity?

          But they don’t teach you in statistics what exactly “n” means, probably because a hundred years ago it was much more literal in nature: 100 samples meant you counted 100 mice, 100 peas, or 100 surveys.
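
          A minimal numeric sketch of that point (all numbers invented): if every reading within a day shares a day-level offset, piling on within-day samples stops shrinking the uncertainty of the grand mean.

            import numpy as np
            rng = np.random.default_rng(7)
            n_days, per_day = 5, 100_000        # 5 experiment days, 100k readings each
            day_sd, within_sd = 1.0, 0.2        # day-to-day variation dominates
            # each day gets its own offset (temperature, humidity, operator, sample prep...)
            day_offsets = rng.normal(0.0, day_sd, size=n_days)
            data = day_offsets[:, None] + rng.normal(0.0, within_sd, size=(n_days, per_day))
            naive_se = data.std(ddof=1) / np.sqrt(data.size)              # pretends n = 500,000
            honest_se = data.mean(axis=1).std(ddof=1) / np.sqrt(n_days)   # treats n = 5 days
            print(f"naive SE: {naive_se:.4f}   honest SE: {honest_se:.4f}")

          The naive standard error looks fantastically small, but the honest one is limited by the five day-level replicates, which is the sense in which n is closer to 5 (or 1) than to 500,000.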

          • clickety_clack 4 hours ago ago

            I learned about experiment design in statistics, so I wouldn’t blame statisticians for this.

            There are a lot of folks out there, though, who learned the mechanics of linear regression in a bootcamp or something without gaining an appreciation for the underlying theory, and those folks are looking for a low p-value; as long as they get it, it’s good enough.

            I saw this link yesterday and could barely believe it, but I guess these folks really live among us.

            https://stats.stackexchange.com/questions/185507/what-happen...

      • pebbly_bread 6 hours ago ago

        Depending on the nature of the study, there are lots of scientific disciplines where it's trivial to get populations in the millions. I got to see a fresh new student's poster where they had a p-value in the range of 10^-146 because every cell in their experiment was counted as its own sample.

    • tryitnow 4 hours ago ago

      Agreed. However, I think you're being overly charitable in calling it a "pet peeve"; it's more like a pathological misunderstanding of stats that leads to a lot of bad outcomes, especially in popular wellness media.

      As an example, read just about any health or nutrition research article referenced in popular media and there's very often a pretty weak effect size even though they've achieved "statistical significance." People then end up making big changes to their lifestyles and habits based on research that really does not justify those changes.

    • amelius 8 hours ago ago

      https://pmc.ncbi.nlm.nih.gov/articles/PMC3444174/

      > Using Effect Size—or Why the P Value Is Not Enough

      > Statistical significance is the least interesting thing about the results. You should describe the results in terms of measures of magnitude –not just, does a treatment affect people, but how much does it affect them.

      – Gene V. Glass

    • jpcompartir 9 hours ago ago

      ^

      And if we increase N enough we will be able to find these 'good measurements' and 'statistically significant differences' everywhere.

      Worse still if we did not agree in advance which hypotheses we were testing, and instead go looking back through historical data to find 'statistically significant' correlations.

      • ants_everywhere 9 hours ago ago

        Which means that statistical significance is really a measure of whether N is big enough

        • kqr 8 hours ago ago

          This has been known ever since the beginning of frequentist hypothesis testing. Fisher warned us not to place too much emphasis on the p-value he asked us to calculate, specifically because it is mainly a measure of sample size, not clinical significance.

          • ants_everywhere 7 hours ago ago

            Yes, the whole thing has been a bit of a tragedy IMO. A minor tragedy, all things considered, but a tragedy nonetheless.

            One interesting thing to keep in mind is that Ronald Fisher did most of his work before the publication of Kolmogorov's probability axioms (1933). There's a real sense in which the statistics used in social sciences diverged from mathematics before the rise of modern statistics.

            So there's a lot of tradition going back to the 19th century that's misguided, wrong, or maybe just not best practice.

        • energy123 6 hours ago ago

          It's not, that would be quite the misunderstanding of statistical power.

          N being big means that small real effects can plausibly be detected as being statistically significant.

          It doesn't mean that a larger proportion of measurements are falsely identified as being statistically significant. That will still occur at a 5% frequency or whatever your alpha value is, unless your null is misspecified.
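
          A quick simulation of that point (one-sample z-test under a true null; the replicate counts are arbitrary): the false-positive rate sits near alpha at every n, because large n buys power against real effects, not more false alarms.

            import numpy as np
            from scipy import stats
            rng = np.random.default_rng(0)
            alpha, reps = 0.05, 2000
            for n in (50, 500, 5000):
                data = rng.normal(0.0, 1.0, size=(reps, n))   # the null really is true
                z = data.mean(axis=1) / (data.std(axis=1, ddof=1) / np.sqrt(n))
                p = 2 * stats.norm.sf(np.abs(z))
                print(f"n={n:5d}  fraction with p < alpha: {np.mean(p < alpha):.3f}")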

          • ants_everywhere 3 hours ago ago

            It's standard to set the null hypothesis to be a measure zero set (e.g. mu = 0 or mu1 = mu2). So the probability of the null hypothesis is 0 and the only question remaining is whether your measurement is good enough to detect that.

            But even though you know the measurement can't be exactly 0.000 (with infinitely many decimal places) a priori, you don't know if your measurement is any good a priori or whether you're measuring the right thing.

    • kqr 8 hours ago ago

      To add nuance, it is not that bad. Given reasonable levels of statistical power, experiments cannot show meaningless effect sizes with statistical significance. Of course, some people design experiments at power levels way beyond what's useful, and this is perhaps even more true when it comes to things where big data is available (like website analytics), but I would argue the problem is the unreasonable power level, rather than a problem with statistical significance itself.

      When wielded correctly, statistical significance is a useful guide to what's a real signal worth further investigation, and it filters out meaningless effect sizes.

      A bigger problem even when statistical significance is used right is publication bias. If, out of 100 experiments, we only get to see the 7 that were significant, we already have a false:true ratio of 5:2 in the results we see – even though all are presented as true.
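
      To see where a ratio like 5:2 can come from, here is a toy simulation; the 95/5 split between null and real effects and the 40% power are my assumptions, purely for illustration.

        import numpy as np
        rng = np.random.default_rng(3)
        alpha, power = 0.05, 0.40          # assumed significance level and power
        n_null, n_real = 95, 5             # assume most tested hypotheses are actually null
        trials = 10_000
        false_pos = rng.binomial(n_null, alpha, size=trials)   # nulls that clear alpha by chance
        true_pos = rng.binomial(n_real, power, size=trials)    # real effects actually detected
        print("significant results per 100 experiments:", (false_pos + true_pos).mean())
        print("false vs true among them:", false_pos.mean(), "vs", true_pos.mean())

      Roughly 7 experiments come out significant, about 5 of them false positives and 2 real, even though every one of the 7 gets written up as a true effect.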

    • V__ 10 hours ago ago

      I really like this video [1] from 3blue1brown, where he proposes to think about significance as a way to update the probability. One positive test (or, in this analogy, a study) updates the probability by X%, and thus you nearly always need more tests (or studies) for a 'meaningful' judgment.

      [1] https://www.youtube.com/watch?v=lG4VkPoG3ko

    • ants_everywhere 9 hours ago ago

      > Significance testing only tells you the probability that the measured difference is a "good measurement". With a certain degree of confidence, you can say "the difference exists as measured".

      Significance does not tell you this. The p-value can be arbitrarily close to 0 while the probability of the null hypothesis being true is simultaneously arbitrarily close to one
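
      This is Lindley's paradox. A sketch in Python (the point null vs. a diffuse N(0,1) alternative and the 50/50 prior odds are my assumptions): hold the z-score fixed at 4, so the p-value stays the same small number throughout, and watch the posterior probability of the null climb toward 1 as n grows.

        import numpy as np
        from scipy import stats
        z = 4.0                                  # fixed test statistic, p ~ 6e-5 throughout
        for n in (1e2, 1e6, 1e10):
            xbar = z / np.sqrt(n)                # observed mean implied by that z at this n
            p = 2 * stats.norm.sf(z)
            # Bayes factor: likelihood of xbar under H0 (mu = 0) vs. under H1 (mu ~ N(0, 1))
            bf01 = (stats.norm.pdf(xbar, 0.0, np.sqrt(1.0 / n)) /
                    stats.norm.pdf(xbar, 0.0, np.sqrt(1.0 + 1.0 / n)))
            post_h0 = bf01 / (1.0 + bf01)        # posterior P(H0) with 50/50 prior odds
            print(f"n={n:.0e}  p={p:.1e}  P(H0 | data) = {post_h0:.3f}")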

      • wat10000 7 hours ago ago

        Right. The meaning of the p-value is: in a world where there is no effect, what is the probability of getting a result at least as extreme as the one you got, purely by random chance? It doesn’t directly tell you anything about whether this is such a world or not.

    • tomrod 8 hours ago ago

      This is sort of the basis of econometrics, as well as a driving thought behind causal inference.

      Econometrics cares not only about statistical significance but also about economic significance/usefulness.

      Causal inference builds on base statistics and ML, but its strength lies in how it uses design and assumptions to isolate causality. Tools like sensitivity analysis, robustness checks, and falsification tests help assess whether the causal story holds up. My one beef is that these tools still lean heavily on the assumption that the underlying theoretical model is correctly specified. In other words, causal inference helps stress-test assumptions, but it doesn’t always provide a clear way to judge whether one theoretical framework is more valid than another!

    • taneq 8 hours ago ago

      I’d say rather that “statistical significance” is a measure of surprise. It’s saying “If this default (the null hypothesis) is true, how surprised would I be to make these observations?”

      • kqr 7 hours ago ago

        Maybe you can think of it as saying "should I be surprised" but certainly not "how surprised should I be". The magnitude of the p-value is a function of sample size. It is not an odds ratio for updating your beliefs.

    • prasadjoglekar 9 hours ago ago

      For all the shit that HN gives to MBAs, one thing they instill in you during the Managerial Stats class is that Stat Sig is not the same as Managerial Sig.

  • nathan_compton 10 hours ago ago

    Really classic "rationalist" style writing: a soup of correct observations about statistical phenomena with chunks of weird political bullshit thrown in here and there. For example: "On a more contemporary note, these theoretical & empirical considerations also throw doubt on concerns about ‘algorithmic bias’ or inferences drawing on ‘protected classes’: not drawing on them may not be desirable, possible, or even meaningful."

    This is such a bizarre sentence. The way it's tossed in, not explained in any way, not supported by references, etc. Like I guess the implication being made is something like "because there is a hidden latent variable that determines criminality and we can never escape from correlations with it, it's ok to use "is_black" in our black box model which decides if someone is going to get parole"? Ridiculous. Does this really "throw doubt" on whether we should care about this?

    The concerns about how models work are deeper than the statistical challenges of creating or interpreting them. For one thing, all the degrees of freedom we include in our model selection process allow us to construct models which do anything that we want. If we see a parole model which includes "likes_hiphop" as an explanatory variable we ought to ask ourselves who decided that should be there and whether there was an agenda at play beyond "producing the best model possible."

    These concerns about everything being correlated actually warrant a much more careful understanding of the political ramifications of how and what we choose to model, and based on which variables, because they tell us that in almost any non-trivial case a model is necessarily, at least in part, a political object, almost certainly consciously or subconsciously decorated with some conception of how the world is or ought to be explained.

    • zahlman 10 hours ago ago

      > This is such a bizarre sentence. The way it's tossed in, not explained in any way,

      It reads naturally in context and is explained by the foregoing text. For example, the phrase "these theoretical & empirical considerations" refers to theoretical and empirical considerations described above. The basic idea is that, because everything correlates with everything else, you can't just look at correlations and infer that they're more than incidental. The political implications are not at all "weird", and follow naturally. The author observes that social scientists build complex models and observe huge amounts of variables, which allows them to find correlations that support their hypothesis; but these correlations, exactly because they can be found everywhere, are not anywhere near as solid evidence as they are presented as being.

      > Like I guess the implication being made is something like "because there is a hidden latent variable that determines criminality and we can never escape from correlations with it, its ok to use "is_black" in our black box model which decides if someone is going to get parole?

      No, not at all. The implication is that we cannot conclude that the black box model actually has an "is_black" variable, even if it is observed to have disparate impact on black people.

      • nathan_compton 9 hours ago ago

        Sorry, but I don't think that is a reasonable read. The phrase "not drawing on them may not be desirable, possible, or even meaningful" is a political statement, except perhaps for "possible," which is just a flat statement that it's hard to separate causal variables from non-causal ones.

        Nothing in the statistical observation that variables tend to be correlated suggests we should somehow reject the moral perspective that it's desirable for a model to be based on causal rather than merely correlated variables, even if finding such variables is difficult or even impossible to do perfectly. And it's certainly also _meaningful_ to do so, even if there are statistical challenges. A model based on "socioeconomic status" has a totally different social meaning than one based on race, even if we cannot fully disentangle the two statistically. He is mixing up statistical and social, moral and even philosophical questions in a way which is, in my opinion, misleading.

        • jeremyjh 7 hours ago ago

          Or maybe your own announced bias against “rationalists” is affecting your reading of this. I agree with GPs interpretation.

        • naasking 6 hours ago ago

          > Nothing in the statistical observation that variables tend to be correlated suggests we should somehow reject the moral perspective that that its desirable for a model to be based on causal rather than merely correlated variables, even if finding such variables is difficult or even, impossible to do perfectly.

          Perfect is the enemy of good. That it would be desirable to construct a model based on causal variables is self-evident, but we don't have those, and if a correlative model can demonstrably improve people's material conditions, even if conditioned on variables you find "distasteful", what is your argument that such a model shouldn't be used?

          • nathan_compton 4 hours ago ago

            It really depends on a lot of things, frankly. For one thing, we, as a society, aren't optimizing for short term material conditions exclusively. The abstract dignity of not letting arbitrary variables determine important aspects of our lives might outweigh certain material benefits.

    • nxobject 38 minutes ago ago

      > For example: "On a more contemporary note, these theoretical & empirical considerations also throw doubt on concerns about ‘algorithmic bias’ or inferences drawing on ‘protected classes’: not drawing on them may not be desirable, possible, or even meaningful."

      As much as I do think that good, parsimonious social science modeling _requires_ theoretical commitments, the test is whether TFA would say the same thing about political cause du jour - say, `is_white` in hiring in an organization that does outreach to minority communities.

    • pcrh 8 hours ago ago

      "Rationalists" do seem to have a fetish for ranking people and groups of people. Oddly enough, they frequently use poorly performed studies and under-powered data to reach their conclusions about genetics and IQ especially.

    • ml-anon 8 hours ago ago

      Yes this is gwern to a "T". Overwhelm with a r/iamverysmart screed whilst insidiously inserting baseless speculation and opinion as fact as if the references provided cover those too. Weirdly the scaling/AI community loves him.

  • ricardobayes 4 hours ago ago

    Not commenting on the topic at hand, but my goodness, what a beautiful blog. That drop cap, the inline comments on the right-hand side that appear on larger screens, the progress bar, chef's kiss. This is what a love project looks like.

  • senko 14 hours ago ago

    The article missed the chance to include the quote from that standard compendium of information and wisdom, The Hitchhiker's Guide to the Galaxy:

    > Since every piece of matter in the Universe is in some way affected by every other piece of matter in the Universe, it is in theory possible to extrapolate the whole of creation — every sun, every planet, their orbits, their composition and their economic and social history from, say, one small piece of fairy cake.

    • sayamqazi 14 hours ago ago

      Wouldn't you need the T_zero configuration of the universe for this to work?

      Given different T_zero configs of matter and energies, T_current would be different. And there are many pathways that could lead to the same physical configuration (position + energies etc.) with different (Universe minus cake) configurations.

      Also, we are assuming there are no non-deterministic processes happening at all.

      • jerf 2 hours ago ago

        The real problem is you need a real-number-valued universe for this to work, where the measurer needs access to the full real values [1]. In our universe, which has a Planck size and Planck time and related limits, the statement is simply untrue. Even if you knew every last detail about a piece of fairy cake, whatever "every last detail" may actually be, and even if the universe is for some reason deterministic, you still could not derive the entire rest of the universe from it correctly. Some sort of perfect intelligence with access to massive amounts of computation may be able to derive a great deal more than you realize, especially about the environment in the vicinity of the cake, but it couldn't derive the entire universe.

        [1]: Arguments are ongoing about whether the universe has "real" numbers (in the mathematical sense) or not. However, it is undeniable that the Planck constants still provide a practical barrier that makes any hypothetical real-valued numbers in the universe inaccessible in practice.

      • senko 13 hours ago ago

        I am assuming integrating over all possible configurations would be a component of The Total Perspective Vortex.

        After all, Feynman showed this is in principle possible, even with local nondeterminism.

        (this being a text medium with a high probability of another commenter misunderstanding my intent, I must end this with a note that I am, of course, BSing :)

      • eru 9 hours ago ago

        > Wouldnt you need the T_zero configuration of the universe for this to work?

        Why? We learn about the past by looking at the present all the time. We also learn about the future by looking at the present.

        > Also we are assuming there is no non-deterministic processed happening at all.

        Depends on the kind of non-determinism. If there's randomness, you 'just' deal with probability distributions instead. Since you have measurement error anyway, you need to do that anyway.

        There are other forms of non-determinism, of course.

        • psychoslave 7 hours ago ago

          > We learn about the past by looking at the present all the time. We also learn about the future by looking at the present.

          We infer things about the past based partly on material evidence we can only subjectively and partially get acquainted with, through thick cultural biases. And what the material evidence suggests should not stray too far from our already integrated internal narrative, otherwise we will ignore it or actively fight it.

          The future is pure phantasm, bound only by our imagination and by what we take to be the unchallengeable fundamentals of what the world allows according to our inner model of it.

          At least, that's one possible interpretation of these thoughts when attention focuses on the present.

    • prox 11 hours ago ago

      In Buddhism we have dependent origination : https://en.wikipedia.org/wiki/Prat%C4%ABtyasamutp%C4%81da

      • lioeters 10 hours ago ago

        Also the concept of implicate order, proposed by the theoretical physicist David Bohm.

        > Bohm employed the hologram as a means of characterising implicate order, noting that each region of a photographic plate in which a hologram is observable contains within it the whole three-dimensional image, which can be viewed from a range of perspectives.

        > That is, each region contains a whole and undivided image.

        > "There is the germ of a new notion of order here. This order is not to be understood solely in terms of a regular arrangement of objects (e.g., in rows) or as a regular arrangement of events (e.g., in a series). Rather, a total order is contained, in some implicit sense, in each region of space and time."

        > "Now, the word 'implicit' is based on the verb 'to implicate'. This means 'to fold inward' ... so we may be led to explore the notion that in some sense each region contains a total structure 'enfolded' within it."

    • euroderf 6 hours ago ago

      Particles do not suffer from predestination, do they?

  • apples_oranges 13 hours ago ago

    People didn't always use statistics to discover truths about the world.

    Statistics, once developed, just happened to be a useful method. But given the abuse of those methods, and the proliferation of stupidity disguised as intelligence, it's always fitting to question it, this time with this observation about correlation noise.

    Logic and fundamental knowledge about domains, you need those first. Just counting things without understanding them in at least one or two other ways is a tempting invitation to misleading conclusions.

    • kqr 7 hours ago ago

      > People didn't always use statistics to discover truths about the world.

      And they were much, much worse off for it. Logic does not let you learn anything new. All logic allows you to do is restate what you already know. Fundamental knowledge comes from experience or experiments, which need to be interpreted through a statistical lens because observations are never perfect.

      Before statistics, our alternatives for understanding the world were (a) rich people sitting down and thinking deeply about how things could be, (b) charismatic people standing up and giving sermons on how they would like things to be, or (c) clever people guessing things right every now and then.

      With statistics, we have to a large degree mechanised the process of learning how the world works, and anyone sensible can participate, and they can know with reasonable certainty whether they are right or wrong. It was impossible to prove a philosopher or a clergyman wrong!

      That said, I think I agree with your overall point. One of the strengths of statistical reasoning is what's sometimes called intercomparison, the fact that we can draw conclusions from differences between processes without understanding anything about those processes. This is also a weakness because it makes it easy to accidentally or intentionally manipulate results.

      • aeonik 5 hours ago ago

        Discovering that two different-seeming statements reduce to the same truth is new knowledge.

    • mnky9800n 9 hours ago ago

      There is a quote from George Lucas where he talks about how, when new things come into a society, people have a tendency to overdo it.

      https://www.youtube.com/watch?v=VEIrQUXm_hY

  • derbOac 6 hours ago ago

    Arguments like this have been around for decades. I think it's important to keep in mind — critical even.

    At the same time, as I've been forced to wrestle with it more in my work, I've increasingly felt that it's sort of empty and unhelpful. "Crud" does happen in patterns, like a kind of statistical cosmic background radiation — it's not meaningless. Sometimes it's important to understand it, and treating it as meaningless gets no one anywhere. Sometimes the associations are difficult to explain when you try to pick them apart, and other times I think they're key to understanding uncontrolled confounds that should be controlled for.

    As often as this background association is present, it's not always there. Sometimes things do have zero association.

    Also, trying to come up with a "meaningful" effect size that's not zero is pretty arbitrary and subjective.

    There's probably more productive ways of framing the phenomenon.

  • scoofy 2 hours ago ago

    A really interesting essay. I'm genuinely surprised the author didn't delve more into philosophy here. I think the biggest issue this essay illustrates is the inductive-deductive jump that modeling (especially mathematical modeling) makes. That is, if the axioms we are using are correct, then the output of the model should reflect reality. This is very obviously not always the case, and because of the black swan problem (in a very real sense) we can never really know if it does.

    Any time we wade into solipsism, however, I think it's important to remember that statistical analysis is a tool, not an arbiter of Truth. We are trying to exist in the mess of a world we live in, and we're going to be using every possible advanced tool we have in our arsenal to do that, and the standard model of science is probably the best tool we have. At the same time, we should often return to that solipsism and remember that where we can improve the model, we should.

  • doommood 29 minutes ago ago

    Does the author account for time autocorrelation? Pearson’s correlation assumes independence between observations, yet time series data often violate this assumption. Without adjusting for autocorrelation, correlations may be artificially inflated.
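
    A small illustration of that inflation (AR(1) noise with an arbitrarily chosen coefficient, nothing from the article): two completely independent but strongly autocorrelated series routinely show sample correlations several times larger than independent white noise of the same length.

      import numpy as np
      rng = np.random.default_rng(1)
      def ar1(n, phi, rng):
          x = np.zeros(n)                 # x_t = phi * x_{t-1} + white noise
          eps = rng.normal(size=n)
          for t in range(1, n):
              x[t] = phi * x[t - 1] + eps[t]
          return x
      n, reps = 500, 1000
      for phi in (0.0, 0.95):
          r = [np.corrcoef(ar1(n, phi, rng), ar1(n, phi, rng))[0, 1] for _ in range(reps)]
          print(f"phi={phi:.2f}  mean |r| between independent series: {np.mean(np.abs(r)):.3f}")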

  • Evidlo 14 hours ago ago

    This is such a massive article. I wish I had the ability to grind out treatises like that. Looking at other content on the guy's website, he must be like a machine.

    • kqr 12 hours ago ago

      IIRC Gwern lives extremely frugally somewhere remote and is thus able to spend a lot of time on private research.

      • tux3 12 hours ago ago

        IIRC people funded moving gwern to the bay not too long ago.

      • lazyasciiart 7 hours ago ago

        That and early bitcoin adoption. There’s a short bio somewhere on the site.

    • pas 12 hours ago ago

      lots of time, many iterations, affinity for the hard questions, some expertise in research (and Haskell). oh, and also it helps if someone is funding your little endeavor :)

    • aswegs8 10 hours ago ago

      I wish I were even able to read things like that.

    • tmulc18 13 hours ago ago

      gwern is goated

  • bsoles 3 hours ago ago

    Any two linearly increasing or decreasing datasets will be necessarily highly correlated. But that doesn't mean anything unless you also have a plausible explanation why it is so.

    Also, correlation is often taken to mean linear correlation. For example, two nonlinearly varying datasets can be perfectly (rank) correlated while their linear correlation is close to zero.

    People often attach undue meaning to correlation.
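
    A quick sketch of the rank-vs-linear point (the exact function is arbitrary): a strictly monotone but extremely convex relationship has Spearman correlation exactly 1 while the Pearson correlation comes out small.

      import numpy as np
      from scipy import stats
      x = np.linspace(0.0, 1.0, 100_001)
      y = np.exp(300 * x)          # strictly increasing, spans ~130 orders of magnitude
      print("Pearson :", stats.pearsonr(x, y)[0])    # small (~0.14): a straight line fits terribly
      print("Spearman:", stats.spearmanr(x, y)[0])   # exactly 1.0: the ranks agree perfectly

    Making the curve more extreme pushes the Pearson value further toward zero while the rank correlation stays at 1.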

  • pcrh 8 hours ago ago

    This is why experimental science is different from observational studies.

    Statistical analyses provide a reason to believe one hypothesis over another, but any scientist will extend that with an experimental approach.

    Most of the examples given in this blog post refer to medical, sociological or behavioral studies, where properly controlled experiments are hard to perform, and as such are frequently under-powered to reveal true cause-effect associations.

  • dang 14 hours ago ago

    Correlated. Others?

    Everything Is Correlated - https://news.ycombinator.com/item?id=19797844 - May 2019 (53 comments)

  • alexpotato 4 hours ago ago

    I was recently looking at a large timeseries dataset.

    When doing a scatter plot of two variables, I noticed that there were several "lines" of dots.

    This generally implies that subsets of the two variables may have correlations or there is a third variable to be added.

    I did some additional research and it is possible for two variables with large N to show correlation for short bursts even if both variables are random.

    I mention this for two reasons:

    1. I was just doing the above and saw the OP article today

    2. Despite taking multiple college level stats classes, I don't remember this ever being mentioned.
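
    On the short bursts of correlation in pure noise, a toy check (window size and seed are arbitrary) shows how large the correlation can look inside short windows of two completely independent series:

      import numpy as np
      rng = np.random.default_rng(42)
      n, window = 100_000, 50
      x = rng.normal(size=n)
      y = rng.normal(size=n)            # independent of x by construction
      r = np.array([np.corrcoef(x[i:i + window], y[i:i + window])[0, 1]
                    for i in range(0, n - window, window)])
      print("overall correlation:", np.corrcoef(x, y)[0, 1])          # ~0
      print("largest |r| in any 50-point window:", np.abs(r).max())   # often 0.4-0.5+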

  • st-keller 13 hours ago ago

    „This renders the meaning of significance-testing unclear; it is calculating precisely the odds of the data under scenarios known a priori to be false.“

    I cannot see the problem in that. To get to meaningful results we often calculate with simplified models - which are known to be false in a strict sense. We use Newton's laws - we analyze electric networks based on simplifications - a bank-year used to be 360 days! Works well.

    What did i miss?

    • bjornsing 12 hours ago ago

      The problem is basically that you can always buy a significant result with money (a large enough N always leads to a “significant” result). That’s a serious issue if you see research as a pursuit of truth.

      • syntacticsalt 11 hours ago ago

        Reporting effect size mitigates this problem. If observed effect size is too small, its statistical significance isn't viewed as meaningful.

        • bjornsing 9 hours ago ago

          Sure (and of course). But did you see the effect size histogram in the OP?

    • thyristan 12 hours ago ago

      There is a known maximum error introduced by those simplifications. Put the other way around, Einstein is a refinement of Newton. Special relativity converges towards Newtonian motion for low speeds.

      You didn't really miss anything. The article is incomplete, and wrongly suggests that something like "false" even exists in statistics. But really something is only false "with an x% probability of it actually being true nonetheless". Meaning that you have to "statistic harder" if you want to get x down. Usually the best way to do that is to increase the number of tries/samples N. What the article gets completely wrong is that for sufficiently large N, you don't have to care anymore, and might as well use false/true as absolutes, because you pass the threshold of "will happen once within the lifetime of a bazillion universes" or something.

      Problem is, of course, that lots and lots of statistics are done with a low N. Social sciences, medicine, and economy are necessarily always in the very-low-N range, and therefore always have problematic statistics. And try to "statistic harder" without being able to increase N, thereby just massaging their numbers enough to get a desired conclusion proved. Or just increase N a little, claiming to have escaped the low-N-problem.

      • syntacticsalt 11 hours ago ago

        A frequentist interpretation of inference assumes parameters have fixed, but unknown values. In this paradigm, it is sensible to speak of the statement "this parameter's value is zero" as either true or false.

        I do not think it is accurate to portray the author as someone who does not understand asymptotic statistics.

        • thyristan 6 hours ago ago

          > it is sensible to speak of the statement "this parameter's value is zero" as either true or false.

          Nope. The correct way is rather something like "the measurements/polls/statistics x ± ε are consistent with this parameter's true value being zero", where x is your measured value and ε is some measurement error, accuracy or statistical deviation. x will never really be zero, but zero can be within an interval [x - ε; x + ε].

          • syntacticsalt 3 hours ago ago

            As you yourself point out, a consistent estimator of a parameter converges to that parameter's value in the infinite sample limit. That limit is zero or it's not.

    • PeterStuer 11 hours ago ago

      Back when I wrote a loan repayment calculator, there were 47 different common ways to 'day count' (used in calculating payments for incomplete repayment periods; e.g., for monthly payments, what is the 1st-13th of Aug 2025 as a fraction of Aug 2025?).
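
      For the concrete example, here is a sketch of just two of those conventions (the helper functions are hypothetical, not from any loan library):

        from datetime import date
        def frac_actual_actual(start: date, end: date, days_in_month: int) -> float:
            return (end - start).days / days_in_month     # actual days over actual month length
        def frac_30_360(start: date, end: date) -> float:
            return (end.day - start.day) / 30             # 30/360: every month counts as 30 days
        start, end = date(2025, 8, 1), date(2025, 8, 13)
        print("actual/actual:", frac_actual_actual(start, end, 31))   # 12/31 ~ 0.387
        print("30/360       :", frac_30_360(start, end))              # 12/30 = 0.400

      Even these two conventions disagree on the same dates, which is exactly why the chosen convention has to be pinned down in the loan terms.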

    • whyever 13 hours ago ago

      It's a quantitative problem. How big is the error introduced by the simplification?

  • psychoslave 8 hours ago ago

    Looks like an impressively thorough piece of investigation. Well done.

    That said, holistic suppositions can certainly be traced back as far as the dawn of writing. Here the focus on the more modern/contemporary era is legitimate to keep the scope delimited to a more specific concern, but it somewhat obscures this fact. Maybe it's already acknowledged in the document; I haven't read it all yet.

  • justonceokay 8 hours ago ago

    Statistical correlations are important to establish, but they are the easiest and least useful part of the research. Creating theories as to “why” and “how” these correlations exist is what advances our knowledge.

    I read a lot of papers that painstakingly show a correlation in the data, but then their theory about the correlation is a complete non sequitur.

  • 2rsf 13 hours ago ago
    • ezomode 11 hours ago ago

      Who should quote who? The article is from 2014.

      • 2rsf 6 hours ago ago

        Actually tylervigen's first web.archive capture is from May 9th 2014, but you are right I used the wrong word

  • eisvogel 14 hours ago ago

    It's just as I suspected - there are NO coincidences.

  • syntacticsalt 13 hours ago ago

    I don't disagree with the title, but I'm left wondering what they want us to do about it beyond hinting at causal inference. I'd also be curious what the author thinks of minimum effect sizes (re: Implication 1) and noninferiority testing (re: Implication 2).

  • petters 14 hours ago ago

    If, for example, two things both change over time, they will be correlated. I think it can be good to keep this article in mind.

    • eru 9 hours ago ago

      > If two things e.g. both change over time, they will be correlated.

      No?

      You can have two independent random walks. E.g. flip a coin, gain a dollar or lose a dollar. Do that two times in parallel. Your two account balances will change over time, but they won't be correlated.

  • endymion-light 7 hours ago ago

    The rest of the page has amazing design, but there's just something about the graphs switching from dark to light that flashbangs my eyes really badly - I think it's the sudden light!

  • hshshshshsh 11 hours ago ago

    Doesn't "everything" mean all things that exist in the universe, and since they exist in the same universe, aren't they all correlated?

    • terminalbraid 7 hours ago ago

      Causality from relativity prevents this from being generally true. There could be things in this universe sufficiently far away that we cannot see because whatever interaction we could observe from it has not reached here yet.

      • frotaur 6 hours ago ago

        It is actually inflation that precludes this from being true (along with relativity, of course). If there wasn't a period where spacetime inflated extremely rapidly, given two points, no matter how far apart, their causal pasts would eventually intersect, if going sufficiently back in the past. As such, they could correlate through a common cause C, which lies in both of their causal pasts.

  • andsoitis 14 hours ago ago

    there is but a single unfolding, and everything is part of it

  • cluckindan 11 hours ago ago

    I wonder if this tendency to correlate truly holds for everything? Intuitively it more or less demonstrates that nature tends to favor zero-sum games. Maybe analyzing correlations within the domain of theoretical physics would highlight true non-correlations in some particular approaches? (pun only slightly intended)

    • eru 9 hours ago ago

      > Intuitively it more or less demonstrates that nature tends to favor zero-sum games.

      Please explain.

      • cluckindan 8 hours ago ago

        For every action, there is an opposite and equal reaction. For example, there is a correlation between the acceleration and deceleration of colliding objects: inertia is transferred, not created or destroyed.

        Similarly, for every chemical and nuclear reaction, when something is gained, something else is lost. For example, when two ions bond covalently by sharing electrons, a new molecule is gained, but the two ions are no longer what they previously were. So there is a correlation between gain of reaction products and loss of reactants.

        But perhaps such analogies cannot be found everywhere in theoretical physics. Perhaps such a non-correlation would be a sign of a novel discovery, or a sign that a theory is physically invalid. It could be a signal of something for sure.

        • terminalbraid 6 hours ago ago

          How do I reconcile this with "entropy invariably increases" which is a contradiction to your hypothesis that "nature tends to favor zero-sum games"?

          How do I reconcile "for every chemical and nuclear reaction, when something is gained, something else is lost" with catalysts increasing rate but not being consumed themselves?

          In fact you can show there are an uncountably infinite number of broken symmetries in nature, so it is mathematically possible to concoct a parallel number of cases where nature does not have some "zero sum game" by Noether's Theorem.

          Your statement is just cherry picking a few and then (uncountably infinitely) overgeneralizing.

          • cluckindan 4 hours ago ago

            Entropy is a measurement of the system itself and doesn’t describe the dynamics within that system.

            Catalysts increase reaction rate just as a train runs faster on a track. Is a railway a catalyst?

            Are symmetries broken in nature or just models of nature? Or are you referring to accepted theories in theoretical physics, which was the entire point here?

  • nnnnico 9 hours ago ago

    There is no zero in the real world

  • 01HNNWZ0MV43FF 12 hours ago ago

    > For example, while looking at biometric samples with up to thousands of observations, Karl Pearson declared that a result departing by more than 3 standard deviations is “definitely significant.”

    Wait. Sir Arthur Conan Doyle lived at basically the exact same time as this Karl Pearson.

    Is that why the Sherlock Holmes stories had handwriting analysis so frequently? Was there just pop science going around at the time that like, let's find correlations between anything and anything, and we can see that a criminal mastermind like Moriarty would certainly cross their T's this way and not that way?

  • jongjong 12 hours ago ago

    Also, I'm convinced that the reason humans intuitively struggle to figure out causality is because the vast majority of causes and effects are self-reinforcing cycles and go both ways. There was little evolutionary pressure for us to understand the concept of causality because it doesn't play a strong role in natural selection.

    For example, eat a lot and you will gain weight, gain weight and you will feel more hungry and will likely eat more.

    Or exercise more and it becomes easier to exercise.

    Earning money becomes easier as you have more money.

    Public speaking becomes easier as you do it more and the more you do it, the easier it becomes.

    Etc...

    • ctenb 12 hours ago ago

      > Public speaking becomes easier as you do it more and the more you do it, the easier it becomes.

      That's saying the same thing twice :)

      • jongjong 9 hours ago ago

        Haha yes. I meant to say the more public speaking you do, the easier it gets so the more often you want to do it.

    • arduanika 6 hours ago ago

      Or, exercise more and you will eat more. Drat.

    • renox 10 hours ago ago

      > Or exercise more and it becomes easier to exercise.

      Only if you don't injure yourself while exercising.

      • jongjong 9 hours ago ago

        A lot of things can happen to break a self-perpetuating cycle. But it's usually some extreme event. The cycle keeps optimizing in a particular direction and this eventually leads to an extreme situation which becomes unsustainable. The natural equilibrium of a self-reinforcing cycle is not static but drifting towards some extreme unstable state. There is usually a point where it breaks down.

        But I suspect that being able to figure out causation doesn't matter much from a survival or reproduction perspective because cause and effect are just labels.

        Reality in a self-perpetuating cycle is probably more like Condition A is 70% responsible and Condition B is 30% responsible for a problem, but they feed back into and exacerbate each other... You could argue that Condition A is the cause and Condition B is the effect because B < A, but that's not quite right IMO. It's also not quite right to say that because A happened first, A is the cause of a severe problem... The problem would never have gotten so bad to such an extent without feedback from B.