> Pricing is good, but profits may continue to be elusive; still no clear technical moat.
It's quite ironic that the technology that is displacing so many people from so many industries has yet to make a profit. I fear the "creative" part of their destruction will take longer to achieve than they advertise.
Nobody has lost their job yet because of AI. But lots of people have lost their jobs because of the money their CEOs spent on AI.
What is worse: terrible charts, terrible charts making it through any form of scrutiny, or terrible charts intentionally making it to the main stage?
> OpenAI conveniently forgot to include this comparison (ARC-AGI-2) in their livestream recital of benchmark progress, which left the livestream looking like marketing rather than science.
Yeah, but it was _supposed_ to be marketing, right? Like, of course a product video isn't science, in the same way a "hot take" post also isn't science.
Why is Grok so surprisingly decent? Does the lack of mainstream liberal-left censorship (replaced with Musky censorship) result in some sort of weird performance boost?
Is it decent, or does it game the tests? Really, I would love to know...
There's nothing weird about a model performing better when it is built to relate more closely to reality instead of an ideologically tainted version of it. I don't know how much Musk & Co. interfere with the fine-tuning of their models, but it is clear that this interference is far less heavy-handed than what the other actors do to theirs.
How are the other ones tainted?
Yes, actually.
Fewer fingers on the scale means the LLM gets to actually do its thing. GPT-4 with zero filtering was scary smart, according to the red teams that were testing it. The version the public got had a lobe tied behind its back.
Having only Grok 3 to compare against, and toying around with GPT-5... GPT-5 is pretty good.