86 comments

  • tkgally 7 hours ago

    More than twenty years ago, I had fun tracing a similar phenomenon: English “proverbs” that appeared in English dictionaries and textbooks published in Japan but that did not seem to have any actual currency in English. It became clear that they had been copied from dictionary to dictionary for decades before large-scale corpora and search engines made it possible to check actual usage.

    “Every man has his humo(u)r.”

    https://www.gally.net/leavings/00/0001.html

    “Losers are always in the wrong.”

    https://www.gally.net/leavings/00/0098.html

    In their heyday, dozens of English-Japanese dictionaries were published in Japan:

    https://www.gally.net/leavings/00/0005.html

    Producing an original dictionary from scratch would have been expensive and time consuming, so most publishers borrowed liberally from each other.

    • floren 3 hours ago

      If you haven't come across "English as She Is Spoke" (https://en.wikipedia.org/wiki/English_as_She_Is_Spoke), your proverbs remind me of that.

      Craunch the marmoset!

    • javawizard 3 hours ago

      I remember running across a shirt for sale in Japan that said:

        Free is free
        Shit is shit
        Damn
      
      I don't know what it was about that particular sequence of words but man if it didn't get me something good.

      • tkgally 2 hours ago

        That definitely deserves proverb status!

        Around the same time I was collecting those ghost proverbs, I spent a pleasant afternoon in Shinjuku, Tokyo, taking pictures of T-shirts:

        https://www.gally.net/tshirts/index.html

  • muhdeeb 8 hours ago

    I'm inclined to give them a pass. It's easy enough to figure out that it should be germanium and not gadolinium, and dyslexia already exists among scientists. Context provides enough information to correct the record.

    I didn't catch the error the first time around because I autocorrected to Ge--there are only so many anions that can make that formula work, and staring at these formulas all day long can make you go cross-eyed anyway.

    What I think is more dangerous to understanding is skipping formulas in favor of initials! BFO instead of BiFeO3, or BT instead of Bi2Te3, SRO for SrRuO3, LSFO for La0.3Sr0.7FeO3: abbreviations that I think obscure too much detail. You can more easily wander into talking about different things with the same terms. Such abbreviations are already endemic in condensed matter physics.

    • h4ny 5 hours ago

      > I'm inclined to give them a pass. It's easy enough to figure out that it should be germanium and not gadolinium, and dyslexia already exists among scientists.

      People make mistakes, and you probably mean well, but this is also the sort of pass that makes scientific research and reporting terrible.

      If it's "easy enough to figure out" then it's even more important to get it right -- why should we trust someone who can't even get the "easy" things right?

      > ... and dyslexia already exists among scientists.

      The article is pointing out a problem that appears to be fairly common; is that really a suitable explanation? Even if it is a suitable explanation, is that a reason for lowering standards, which you can then apply to explain away every mistake?

      Keep in mind that proper publications should usually have been reviewed by at least 3 people including the authors (typically more) by the time everyone else gets to read it. So that kind of mistake isn't really acceptable.

      > What I think is more dangerous to understanding is skipping formulas in favor of initials! BFO instead of BiFeO3, or BT instead of Bi2Te3, SRO for SrRuO3, LSFO for La0.3Sr0.7FeO3: abbreviations that I think obscure too much detail. You can more easily wander into talking about different things with the same terms. Such abbreviations are already endemic in condensed matter physics.

      If you have been trained in scientific writing, you would always introduce an abbreviation. For example, "BiFeO3 (BFO)" and "SrRuO3 (SRO)". It's also common to include a list of abbreviations in some forms of scientific writing.

    • pseudochemist 7 hours ago

      > I'm inclined to give them a pass. It's easy enough to figure out that it should be germanium and not gadolinium, and dyslexia already exists among scientists.

      I’m not. If somewhat said Pi was 9.14 I think no one would give it a pass. It’s not like a misspelling. It’s an invalid element which is the chemistry equivalent of an absurdly wrong number in maths.

      • snarkconjecture 5 hours ago

        It's more like saying pi is approximately "3..14". Easily corrected syntax errors aren't as bad as semantic errors.

        • h4ny 5 hours ago

          No. The 9.14 vs. 3.14 analogy is more suitable.

          If you have read the blog post it's a difference between the chemical symbol Ge and Gr, which as I understand is what you would refer to as a "semantic error".

          • voxic11 29 minutes ago

            But Gr isn't an element, so no one would ever misidentify it as part of a compound; it's obviously a mistake. Like if I said pi was 3.`4

      • handoflixue 6 hours ago

        It should be "someone", not "somewhat".

        "Pi" is only capitalized at the start of a sentence.

        "no one would give it a pass" is a logically unsound claim, given the number of people on the planet.

        How very absurdly wrong of you :)

    • kazinator 8 hours ago

      The typo is not the problem; it's that the typo is evidence of academic dishonesty.

      When you make a citation, it means you cracked open the original work, understood what it says and located a relevant passage to reference in your work.

      The authors are propagating the same typo because they are not copying the original correct text; they are just copying ready-made citations of that text which they plant into their papers to manufacture the impression that they are surveying other work in their area and taking it into account when doing their work.

      They survey one or two works, and then just steal their citations to make it look like they also surveyed 19 other works.

      Problem is, the citations in those works are already copies of borrowed citations from some other paper, which copied some of them from another paper, and that was the honest one that made a typo in a genuine, organically grown citation.

      • dataflow 7 hours ago

        Just because you propagated a typo doesn't mean you didn't see the original. It could just mean that you saw the typo more recently and that's what stuck in your mind as you were busy writing.

      • light_hue_1 6 hours ago

        It's not academic dishonesty.

        When you read plenty of papers you aren't going to read them again to cite them. You take them from your read.bib file.

        Also citations generally don't link to a passage. They are pointers to an entire paper.

  • kazinator 9 hours ago

    Researchers are blindly copying and pasting lists of citations into papers, because they did original work in a vacuum, i.e. without taking the time to study anyone else's work in the same area to understand where the field is at. Since papers without citations, or with too few citations, are giant red flags for publication, they need to generate something to mask the problem.

  • arialdomartini 11 hours ago

    Laurent Bossavit wrote a whole book about similar cases in the IT world, “The Leprechauns of Software Engineering: How folklore turns into fact and what to do about it”.

  • pimlottc 13 hours ago

    I would guess part of the issue is the subscripts. It’s annoying to type out formulas so it’s faster to just cut-n-paste.

  • ddingus 10 hours ago

    Summary: Because they are not writing!

    They are copying data and placing it into documents.

    Obviously, these are not the same thing.

  • teiferer 11 hours ago

    If you ask ChatGPT about Cr2Gr2Te6 then it will correct you. The author's worry might be unfounded.

    Though since he didn't date his article, it's unclear how long it has been out there so unclear as well whether it made its way into training data. Judging from the comments and the URL, it's quite new, but again, he should add a date to his articles.

    • jibal 10 hours ago

      When I search for Cr2Gr2Te6, Google Gemini tells me:

      "AI Overview Cr2Gr2Te6 is a miswritten, imaginary compound; the correct compound is Cr2Ge2Te6 (Chromium Germanium Telluride), where Cr stands for chromium, Ge for germanium, and Te for tellurium. This error, where 'Gr' was mistakenly used for 'Ge', has been replicated in multiple scientific publications since its discovery in 2017, despite the correct formula being known and published."

    • ddingus 10 hours ago

      The URL is formed using the date, just FYI. :)

      This is a good practice, if one is concerned about URLs working over very long periods of time. "Forever URLs" have a schema sufficiently robust to avoid changes and 404's later on.

      • jibal 10 hours ago

        > The URL is formed using the date, just FYI.

        As they stated, so who are you informing?

        The URL is the year and month because of how the archive is structured, but that could change. The article is not dated but should be--all articles should be. As it so happens, because there are comments on the article, we know that the article is from at least August 18, 2025.

        • ddingus 10 hours ago

          Apparently nobody! I misread. Good grief, subtle problems related to this overall discussion are chronic.

          • jibal 9 hours ago

            Kudos for accepting responsibility. And I wrote "at least" when it should be "at most".

            • ddingus 5 hours ago

              It is easier that way. Less to manage.

  • rdtsc 13 hours ago

    Gr is the science journal version of Van Halen's brown M&M rider -- it's how you can tell the reviewers and the authors had no idea what they were doing and just copy-pasted junk around.

    I think established authors should try to sprinkle obvious mistakes like that on purpose once in a while in the literature and then see how much it spreads.

  • dawnofdusk 12 hours ago

    As any practicing scientist knows, even good research papers may be littered with blatant but unimportant errors. There is unfortunately no good reason or system to "correct the record", and it is not clear to me if such a thing is a good use of human resources. Nonetheless, I think correcting the record is always appreciated!

    • jessfyi 12 hours ago

      Getting a compound incorrect is not an "unimportant" error (for example the difference between sodium nitrate & sodium nitrite is small but critical) and seeing "small but blatant" errors actively propagated is the entire reason why the record should be corrected. The only upside of these little artifacts like "vegetative electron microscopy" [0] is that it's a leading indicator that the entire paper and team deserve more scrutiny--as well as any of those who cite it.

      [0] https://www.sciencealert.com/a-strange-phrase-keeps-turning-...

      • avar 10 hours ago

        I believe they meant that it's "unimportant" because (to use your example) sodium nitrate and sodium nitrite actually exist, whereas there's no element with the chemical symbol "Gr".

      • dawnofdusk 9 hours ago

        The error in the OP is a typo that could never seriously confuse anyone, as the element Gr does not exist.

        An interesting perspective is Terry Tao's on local vs. global errors (https://terrytao.wordpress.com/advice-on-writing-papers/on-l...). A typo like this, even if propagated, is a local error which at worst makes it very annoying to Ctrl-F papers or do literature review. Local errors deserve to be corrected, but in practice their importance to science as a field is small.

    • the__alchemist 12 hours ago

      That is a possible, but charitable explanation. I would like to hold your opinion, but don't know if I can. It must compete with less-charitable ones.

    • thewanderer1983 9 hours ago

      Have you heard of this thing called Peer Review? It's what academia holds up as its gold standard, and it is supposed to pick up on these things.

      • crazygringo 5 hours ago

        Peer review isn't spellcheck or proofreading.

        It's about logic, methodology, significance, and citations.

        It's not some gold standard of perfection or truth.

    • jibal 10 hours ago

      That's not only quite factually wrong, but has nothing to do with the point, which is about mindless copying.

      • dawnofdusk 9 hours ago

        If it is factually wrong please tell me how.

  • johnea 13 hours ago

    Much of the www is composed of copying.

    I recently corrected an error in this wikipedia article:

    https://en.wikipedia.org/wiki/Cape_Shionomisaki

    Which stated: "Geologically, the cape is a flat uplifted seafood plateau"

    My comment for the change: I'm not an oceanographer, but I'm pretty sure it's not a "seafood plateau". Changed to "seabed plateau"

    Afterward, out of curiosity, I did a search for "seafood plateau".

    I was shocked at the number of sites that exactly copied that error along with the rest of the page. Most of these sites were clones of wikipedia with the inclusion of ads.

    It didn't seem that these sites were LLM generated (they were exact copies), but this seems to be the case for many scientific paper submissions now.

    Where it all goes from here is extremely unclear, but it does seem a disruption to many fields which are dependent on written material is in progress...

    • fer 11 hours ago

      A friend did an edit (though you could call it vandalism) of Wikipedia 20 years back. He linked from several pages to a non-existent apportionment method, and created an article with a fairer version of d'Hondt for elections, quite ingenious and probably fairer than the popular alternatives in most cases. He named it after himself (he has an unusual last name and capitalised on that).

      It didn't take long for the page to be dropped for being original research, and he didn't put it anywhere else.

      To this day, you can still find pages and people referencing the method.

      Edit: a quick check and Grok and ChatGPT have scraped it, Gemini hallucinates something unrelated.

    • hidroto 11 hours ago

      I would have thought it was a typo of 'seafloor' rather than 'seabed'.

    • Animats 12 hours ago

      "Seafood plateau"?? A bad translation of "plateau de mer", which is just a seafood platter?

      • BrandoElFollito 12 hours ago

        "Plateau de mer" is not seafood platter. Seafood platter is "plateau de fruits de mer".

        "Plateau de mer" could be "seabed plateau", but I am not an oceanographer so I do not know what words they use (but strictly from the perspective of the French language it is plausible).

        • gyomu 12 hours ago

          It would be “plateau marin”, not “plateau de mer”. “Plateau de mer” does sound like a seafood restaurant special.

        • Animats 12 hours ago

          "Plateau de fruits de mer" is proper, but shortened in cooking practice.

          • BrandoElFollito 12 hours ago

            Ah, I learned something then. I found a few references in Google indeed.

      • bombela 9 hours ago

        French here, asked the frenchies around me. Nobody thinks "plateau de mer" is an obvious shorthand for "plateau de fruits de mer". We have never heard that one. And we sure eat seafood platters on the regular.

    • jibal 11 hours ago

      Of course much of the web is composed of copying, and of course copies of Wikipedia are copied--that's hardly relevant. But science journals are another matter. From the article: "shouldn't the peer reviewers and proofreaders at a top journal catch this error?"

  • ElijahLynn 13 hours ago

    Thank you for your effort in correcting this, it takes time and effort, appreciate it!

  • halo 9 hours ago

    I’m beginning to think my reluctance to shamelessly copy has held me back in life. It’s clearly more widespread than I naively assumed (and I say that without casting judgment).

  • nullc 13 hours ago

    You can just google for various wrong but almost right values of pi and find many examples. People copy and paste wrong stuff all the time.

  • ungreased0675 7 hours ago

    There’s a kernel of an idea here. Something like canary tokens for scientific research.

  • michaelg7x 10 hours ago

    You make deliberate and subtle errors so you can detect later plagiarism more easily.

  • Martin_Silenus 13 hours ago

    You should try to rewrite your article by stating "Ge2" ten times, and "Gr2" one time only.

    • TehCorwiz 13 hours ago

      Disagree. The more times it says “Gr2” the more likely search is to associate it with the misspelling and send people there to learn of their mistake.

    • kens 13 hours ago

      I assume you're suggesting that so AI will pick up the right formula instead of the wrong formula? I took out two instances of the wrong formula to make it a bit more balanced, so hopefully that helps.

      • robocat 12 hours ago

        I want AI to continue making AI mistakes, so maybe don't help the AI too much!

        The comments mention "vegetative electron microscopy", which has an awesome writeup: https://theconversation.com/a-weird-phrase-is-plaguing-scien...

      • codeflo 13 hours ago

        I seem to have missed the memo that we're primarily writing for AIs now.

        • nlawalker 12 hours ago
        • janfoeh 13 hours ago

          In recent years, a sizeable number of people have begun to end questions in regular discussions (such as requests for recommendations) with the current year, as in "which framework should I choose for X in 2025?". Presumably due to SEO filth and its effects on Google.

          > I seem to have missed the memo that we're primarily writing for AIs now.

          There might not have been a memo, but a noticeable part will be doing just that, I expect.

      • gowld 13 hours ago

        It's still wrong 7 times in the document...

        You could add [sic] after each incorrect version.

        • Freak_NL 13 hours ago

          [sic] is for when you quote someone verbatim, keeping the typo. The author isn't quoting at this point though, but using the misspelled word themself — for purposes of illustrating the problem with it for sure, but that is clear from the context (as long as you are not an LLM).

  • oaiey 10 hours ago

    They also continue writing about Unobtainium.

  • pantulis 12 hours ago

    Is it thiotimoline?

    • GolfPopper 12 hours ago

      I've heard that thiotimoline is such a bizarre substance, PhD candidates are known to hysterically collapse when asked about it. ;-)

      • jfengel 10 hours ago

        Sometimes even before they've heard of it.

  • bobmcnamara 8 hours ago

    Grrrrrmanium!

  • cyanydeez 10 hours ago

    Ok, but if they used the right reference it'd be the wrong reference. Just like when a code base contains typos: you know it's a typo, but if you try to fix it, you never really know how it's referenced outside your code base.

    • jibal 10 hours ago

      What?

  • olddustytrail 13 hours ago

    The second reference link had Ge rather than Gr in the abstract. These seem a tiny number of typos.

    How many papers have the correct formula?