The spec rarely has enough detail to deterministically create a product, so current vibecoding is a lottery.
So we generate one or many changesets (in series or in parallel), then iterate on one. We force the “chosen one” to be the one true codification of the spec + the other stuff we didn’t write down anywhere. Call it luck-driven development.
But there’s another way.
If we keep starting fresh from the spec, but keep adding detail after detail, regenerating from scratch each time.. and the LLM has enough room in context to handle a detailed spec AND produce output, and the result is reasonably close to deterministic because the LLM makes “reasonable choices” for everything underspecified.. that’s a paradigm shift.
In what environment do you run such tests? Do you have a script for it, or do you have a UI that manages the process?
Well, it’s really a return to the old-fashioned role of an analyst coming up with a data dictionary and a detailed spec. But in practice how often did that work as intended?
> The spec rarely has enough detail to deterministically create a product, so current vibecoding is a lottery.
How is that different from how it worked without LLMs? The only difference is that we can now get a failing product faster and iterate.
> If we keep starting fresh from the spec, but keep adding detail after detail, regenerating from scratch each time..
This sounds like the worst way to use AI. LLMs can work on existing code, whether it was generated by an LLM or written by a human. They can even work on code that has been edited by a human. There is no good reason not to be iterative when using an LLM to develop code, and plenty of good reasons to be iterative.
For me it just depends. If the response to my prompt shows the model misunderstood something, then I go back and retry the previous prompt. Otherwise the "wrong ideas" it comes up with persist in the context and seem to sabotage all future results. Most of this sort of coding I've done has been in Google's AI Studio, and I often have a context that spans dozens of messages, but I always rewind if something goes off-track. Basically, any time I'm about to make a difficult request, I clone the entire context/app to a new one so I can roll back cleanly whenever necessary.
If you fix something, it sticks: the AI won't keep making the same mistake, and it won't change code that already exists if you ask it not to. It actually ONLY works well when you are doing iterative changes rather than using it as a pure code generator; AI's one-shot performance is kind of crap. A mistake happens, you point it out to the LLM and ask it to update the code and the instructions used to create the code in tandem. Or you just ask it to fix the code once. You add tests, partially generated by the AI and curated by a human; the AI runs the tests and fixes the code if they fail (or fixes the tests).
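A minimal sketch of that test loop, in Python. Assumptions: pytest is the test runner, and ask_llm is a hypothetical placeholder for whatever model API you use; applying (and hand-curating) its suggested patch is deliberately left out.

    # Sketch of the LLM-in-the-loop test-fix cycle described above.
    import subprocess

    def ask_llm(prompt: str) -> str:
        """Hypothetical placeholder: call your model API and return its reply."""
        raise NotImplementedError

    def run_tests() -> tuple[bool, str]:
        # Run the project's test suite and capture its output.
        result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
        return result.returncode == 0, result.stdout + result.stderr

    def fix_until_green(max_rounds: int = 5) -> bool:
        for _ in range(max_rounds):
            passed, output = run_tests()
            if passed:
                return True
            # Feed the failures back; applying the proposed fix
            # (and curating it by hand) is left out of this sketch.
            ask_llm("These tests failed; propose a fix:\n" + output)
        return False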
> How is that different from how it worked without LLMs?
I won't lie and say "That's a great idea" when it isn't.
Yes, I believe the paradigm shift will be to not treat the code as particularly valuable, just like binaries today. Instead the value is in the input that can generate the code.
At that level of detail, how far removed are we from “programming”?
Without understanding the level of detail required, which we do not yet know, we cannot say.
When I think of English specifications that (generally) aim to be very precise, I think of laws. Laws do not read like plain, common language, because plain common language is bad at being specific. Interpreting and creating laws requires an education on par with that required of an engineer, often greater.
Laws being unreadable is largely an English-language problem, though. I have no problem reading them in my native language. Not needing a massive body of case law as context makes things easier still. A big part of being a lawyer is having the same context as all the other lawyers and knowing what was already decided and what possible new interpretation is likely to be accepted by everyone else.
> A big part of being a lawyer is having the same context as all the other lawyers and knowing what was already decided and what possible new interpretation is likely to be accepted by everyone else.
And to create software specifications with language, the same thing will need to happen. You’ll need shared terminology and context that the LLM will correctly and consistently interpret, and that other engineers will understand. This means that very specific meanings become attached to certain words and phrases. Without this, you aren’t making precise specifications. To create and interpret these specifications will require learning the language of the specs. It may well still be easier than code - but then it would also be less precise.
And this could end up looking more like mathematics notation than English. For the same reason mathematicians opt to use specialized notation to communicate with greater precision than natural language.
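As a toy illustration (mine, and deliberately trivial): even the single English word "sorted" leaves open choices that a checkable predicate has to pin down.

    # "The output is sorted" in English vs. as a checkable predicate.
    # Writing the predicate forces choices English leaves open:
    # ascending or descending, strict or non-strict comparison.
    def is_sorted_ascending_nonstrict(xs: list[int]) -> bool:
        return all(a <= b for a, b in zip(xs, xs[1:]))

    assert is_sorted_ascending_nonstrict([1, 2, 2, 3])
    assert not is_sorted_ascending_nonstrict([3, 1, 2])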
Far!
But without the need to “program” you can focus on the end user and better understand their needs - which is super exciting.
> “As an aside, I think there may be an increased reason to use dynamic interpreted languages for the intermediate product. I think it will likely become mainstream in future LLM programming systems to make live changes to a running interpreted program based on prompts.”
Curious whether the author is envisioning changing the configuration of running code on the fly (which shouldn’t require an interpreted language), or whether they are referring to changing behavior on the fly?
Assuming the latter, and maybe setting the LLM aspect aside: is there any standard safe programming paradigm that would enable this? I’m aware of Erlang (message passing) and actor pattern systems, but interpreted languages like Python don’t seem to be ideal for these sorts of systems. I could be totally wrong here, just trying to imagine what the author is envisioning.
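The closest thing I can picture in plain Python is importlib.reload. A rough sketch, where handlers.py is a hypothetical sibling module that someone (or an LLM) edits while the loop runs:

    # Sketch: pick up edits to a module while the program keeps running.
    # `handlers` is a hypothetical module defining handle(event).
    import importlib
    import time

    import handlers

    while True:
        importlib.reload(handlers)   # re-execute the module's current source
        handlers.handle("tick")      # the next call uses the fresh code
        time.sleep(1)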
I think at some point in the future, you'll be able to reconfigure programs just by talking to your LLM-OS: Want the System Clock to show seconds? Just ask your OS to make the change. Need a calculator app that can do derivatives? Just ask your OS to add that feature.
"Configuration" implies a preset, limited number of choices; dynamic languages allow you to rewrite the entire application in real time.
I agree that as LLMs approach the capabilities of human programmers, the entire software paradigm needs to change radically. Humans at that point should just ask their computers in human language to introduce a new visualization or report or input screen and the computer just creates it near instantly.
Of course this requires a huge architecture change from OS level and up.
Maybe I'm missing it, but when my calculator app gets a new derivatives feature, how am I supposed to check that it's implemented correctly? End-user one-shot generation of bug-free code seems like a different technology than what LLMs offer.
Yeah I don't see how LLMs are ever supposed to be reliable enough for this, but they did say "at some point in the future", which leaves room for another (better) technology.
I was envisioning the latter (changing behavior on the fly). Think the hot-reload that Flutter/Dart provides, but on steroids and guided by an LLM.
Interpretation isn’t strictly required, but I think runtimes that support hot-swap / reloadable boundaries (often via interpretation or JIT) make this much easier in practice.
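A minimal sketch of what I mean by a reloadable boundary (illustrative Python, not from any particular framework): callers resolve the implementation at call time, so a swap takes effect on the very next call without restarting the process.

    # Late-binding dispatch: the registry is consulted on every call,
    # so replacing an entry hot-swaps behavior in the running process.
    from typing import Callable

    _impl: dict[str, Callable[[str], str]] = {}

    def register(name: str, fn: Callable[[str], str]) -> None:
        _impl[name] = fn

    def call(name: str, arg: str) -> str:
        return _impl[name](arg)  # resolved per call, not at import time

    register("greet", lambda who: f"hello, {who}")
    print(call("greet", "world"))                   # hello, world
    register("greet", lambda who: f"hi, {who}!")    # hot-swap
    print(call("greet", "world"))                   # hi, world!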
Smalltalk, MUMPS?
The analogy to IDE templates seems more compelling.
This is another pointless article about LLMs... vibe coding is the present, not the future. The only sad part of all of it is that LLMs are killing something important: code documentation.
All the documentation out there for new libs is AI-generated, and that is fed back into LLMs via MCP/Skills servers. The age of the RTFM gang is over, sigh.
"Many have compared the advancements in LLMs for software development to the improvements in abstraction that came with better programming languages."
Where can I see examples of this?
Comments comparing LLMs to just another level on the abstraction ladder are fairly commonplace:
https://news.ycombinator.com/item?id=46439753
https://news.ycombinator.com/item?id=46369114
https://news.ycombinator.com/item?id=46366864
Tons of people throw this argument out on social media. "You keep using assembly while I go up an abstraction layer by using AI."
I can only assume people saying that don't even know what assembly is. Actually, as I typed that out I remembered seeing one comment where someone said "hexcode" instead of assembly (lol)