GPT-5 Hot Take

(garymarcus.substack.com)

39 points | by almost-exactly a day ago

9 comments

  • LeftHandPath a day ago

    > Pricing is good, but profits may continue to be elusive; still no clear technical moat.

    It's quite ironic that the technology that is displacing so many people from so many industries has yet to make a profit. I fear the "creative" part of their destruction will take longer to achieve than they advertise.

    • belter a day ago

      Nobody has lost their job yet because of AI. But lots of people have lost their jobs because of the money their CEOs spent on AI.

  • darth_avocado a day ago

    What is worse: terrible charts, terrible charts making it through any form of scrutiny, or terrible charts intentionally making it to the main stage?

  • abeppu a day ago

    > OpenAI conveniently forgot to include this comparison (ARC-AGI-2) in their livestream recital of benchmark progress, which left the livestream looking like marketing rather than science.

    Yeah, but it was _supposed_ to be marketing, right? Like, of course a product video isn't science, in the same way a "hot take" post isn't science either.

  • dude250711 a day ago

    Why is Grok so surprisingly decent? Does the lack of mainstream liberal-left censorship (replaced with Musky censorship) result in some sort of weird performance boost?

    • pupppet a day ago

      Is it decent, or does it game the tests? Really, I would love to know.

    • hagbard_c a day ago

      There's nothing weird about a model performing better when it is built to relate more closely to reality instead of an ideologically tainted version of it. I don't know how much Musk & Co. interfere with the fine-tuning of their models, but it is clear that this interference is far less heavy-handed than what the other actors do to their models.

    • slowmovintarget a day ago

      Yes, actually.

      Fewer fingers on the scale means the LLM gets to actually do its thing. GPT-4 with zero filtering was scary smart, according to the red teams that were testing it. The version the public got had a lobe tied behind its back.

      Having only Grok 3 to compare against, and having toyed around with GPT-5, I'd say GPT-5 is pretty good.