The biggest issue is simply: an HDL is not a programming language.
As such, making an HDL look like a programming language doesn't increase my expressiveness or productivity all that much. (Side note: this assumes VHDL or SystemVerilog--old-school Verilog is terribly anemic in ways that SystemVerilog mostly fixed)
Now, my testbench, on the other hand, is dramatically helped by being in a real programming language with the appropriate constructs. However, we already have cocotb (in Python!) which fits the bill there and interfaces directly with my simulators without a translation step.
That is the point of Amaranth (and migen). It's not HLS; it doesn't try to generate logic from normal Python code. It's an HDL implemented in Python, not trying to look like Python code.
In my opinion, it's even more of an HDL than SV or VHDL: it has native concepts of clock domains, reset handling, and e.g. block RAM; it describes the hardware. There is no mismatch between two different, weird assignment types. It concentrates on accurately covering the available, commonly used hardware constructs with commonly used features, rather than introducing useless abstractions that don't help in practice (trivial example: your clock is not just another signal that messes up timing everywhere when you try to use it that way).
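For instance, a second clock domain is a first-class object you declare and then assign logic to by name; the clock is never routed around as an ordinary signal. A minimal sketch (the module and domain names here are made up, not from any real design):

    from amaranth import ClockDomain, Elaboratable, Module, Signal

    class TwoDomainCounter(Elaboratable):
        def __init__(self):
            self.fast_count = Signal(8)
            self.slow_count = Signal(8)

        def elaborate(self, platform):
            m = Module()
            # Declare an extra domain; its clk/rst wires belong to the domain
            # object and get driven by the parent design or the platform,
            # not hand-wired through every module's port list.
            m.domains += ClockDomain("slow")
            m.d.sync += self.fast_count.eq(self.fast_count + 1)  # default "sync" domain
            m.d.slow += self.slow_count.eq(self.slow_count + 1)  # explicit "slow" domain
            return m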
But on the other hand, there is no weird, half-useless metaprogramming/simulation layer with almost the same syntax but subtly different rules. That part is plain Python.
Thus it's easy to write code that is flexible over different interface types (e.g. one async FIFO that works for almost any payload, or a width adapter that mostly works automatically), to generate memory-mapped registers that are correctly hooked up, or to generate wide CRC operations that are optimized in Python code before any HDL is emitted, since Vivado barfs on them otherwise (I've also seen that last one done with Python code that generates SystemVerilog by string concatenation).
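As a concrete illustration of that last pattern, here is a minimal sketch (mine, not the commenter's code) of a parallel CRC generator: plain Python runs a bit-serial reference CRC over basis vectors to work out, per output bit, which inputs get XORed together, and only those XOR reductions become hardware. It assumes a CRC-16/CCITT-style polynomial with no init/final XOR, and all names are invented.

    from functools import reduce
    from operator import xor

    from amaranth import C, Elaboratable, Module, Signal

    class ParallelCRC(Elaboratable):
        def __init__(self, data_width=32, crc_width=16, poly=0x1021):
            self.data_width = data_width
            self.crc_width = crc_width
            self.poly = poly
            self.data = Signal(data_width)
            self.crc_in = Signal(crc_width)
            self.crc_out = Signal(crc_width)

        def _serial_crc(self, crc, data):
            # Plain-Python bit-serial reference CRC over one data word (MSB first).
            for b in reversed(range(self.data_width)):
                fb = ((crc >> (self.crc_width - 1)) ^ (data >> b)) & 1
                crc = (crc << 1) & ((1 << self.crc_width) - 1)
                if fb:
                    crc ^= self.poly
            return crc

        def elaborate(self, platform):
            m = Module()
            # The CRC update is linear over GF(2), so feeding basis vectors through
            # the serial reference tells us exactly which crc_in/data bits each
            # output bit depends on. Only the resulting XOR trees become HDL.
            for i in range(self.crc_width):
                taps = [self.crc_in[j] for j in range(self.crc_width)
                        if (self._serial_crc(1 << j, 0) >> i) & 1]
                taps += [self.data[k] for k in range(self.data_width)
                         if (self._serial_crc(0, 1 << k) >> i) & 1]
                m.d.comb += self.crc_out[i].eq(reduce(xor, taps, C(0, 1)))
            return m

The heavy lifting (finding the taps) happens in ordinary Python before any netlist exists, so the synthesizer only ever sees flat XOR trees.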
I think my point is just that SystemVerilog isn't a good HDL. It makes it hard to (correctly) describe hardware and makes reuse often harder than it needs to be.
Regarding the intermediate translation step: Amaranth has its own native simulator, and for system-level simulation it interfaces directly with yosys (cxxrtl), as it does for synthesis on parts yosys supports directly, bypassing the need for a conversion to Verilog. That conversion is "only" needed for most commercial tools, which can only process SV and VHDL.
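For the commercial-tool path, the conversion is just another Python call; a sketch reusing the hypothetical ParallelCRC module from above (amaranth.back.verilog is real, but treat the surrounding details as illustrative):

    from amaranth.back import verilog

    top = ParallelCRC(data_width=64, crc_width=16)
    with open("parallel_crc.v", "w") as f:
        # Emits Verilog (going through yosys under the hood) for tools
        # that can't consume Amaranth or yosys IL directly.
        f.write(verilog.convert(top, ports=[top.data, top.crc_in, top.crc_out]))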
I find myself deeply skeptical of much of the open source FPGA movement.
Most of those efforts stem from the underlying notion that “…this is all a problem with the tooling!”
This approaches the problem space through a very software-centric lens. Fundamentally, gateware design isn't software; if you boil it down, it's wiring together logic gates. Treating it as a tooling problem misconstrues how much you actually know. Plainly: no open source toolchain is going to have insight into Xilinx's internal fanout or propagation delay specs. You're reliant on Xilinx to encode these into their tools for you.
As a result: “Vendor tools are God in FPGA land. You don’t go against God.” (Quoted from the staff FPGA engineer on my team.)
I've found there's a fundamentally different attitude among FPGA engineers compared to software engineers for better or worse.
I think the "vendor tools are god" attitude is overall negative. The vendor tools ARE leagues better than the open source alternatives, but it doesn't mean the open source stuff is just for toy projects.
For example, Vivado is a monolithic pain in the ass. If I want to use an FPGA as electrical super glue for a project, I don't want to be downloading 150GB onto my machine. I think the open source tooling is particularly useful for smaller projects, and the general attitude towards the tooling is really frustrating.
The research that has gone into reverse-engineering the bitstream structure and the overall structure of FPGA fabric is extremely impressive. Vendor IPs are often bloated to push you toward the bigger chip, whereas open source IPs take up much less of the fabric.
There's give and take but from my perspective a big problem is the tooling.
> The vendor tools ARE leagues better than the open source alternatives, but it doesn't mean the open source stuff is just for toy projects
It’s not a question of developer experience. We heartily agree that using Vivado sucks. My point is that there’s no way around using Vivado (or the Altera equivalent) if you want to use the most useful parts of modern FPGAs.
It’s simply not possible to do things like access the GTH transceivers or custom MAC blocks using open source FPGA tools. These are table stakes capabilities to make these chips useful. You can only use vendor tools to access them.
I suspect - but cannot prove - that Xilinx has agreements with IP vendors about how much they are allowed to reveal about the guts of the different devices they integrate into Zynq family die.
I also suspect that Xilinx has considerable intellectual property investment in the underlying architecture of their PL. Making 600MHz programmable fabric is no mean technical feat. They probably judge that open sourcing their tools would risk revealing those technical advantages.
The parent comment made it sound like vendors intentionally withhold information to limit how well the open source competitors work.
Is that happening? If so, it's a clear barrier to OSS adoption/contribution. A shame.
100% yes. The actual format of the binary that the device interprets is treated as a trade secret, and the vendor tooling is the only documented way to target code at those parts.
This has various implications: smaller FPGAs are literally tooling-locked versions of larger ones; important features like partial reconfiguration are supported by the hardware but a huge pain to use from the logic; and if the vendor tools don't support some language construct, you're stuck with that tool. (Admittedly they are far and away better than FOSS tooling for language support.)
They certainly do withhold information. See my other comment for my speculation as to why.
I suspect it has nothing to do with the OSSW community; they give their software tools away.
You can tell the veteran status of FPGA devs by the quality of their rants about the tools. The big FPGA companies have no quality metrics for developer experience. You should be able to make an LED blink within a minute of powering up a board and not after a day of downloading and installing stuff. It used to be possible to quickly start with Vivado on AWS cloud, and I was using that workflow for years, although recent licensing changes presented a speed-bump there, and I ended up going with a local install for my recent project.
Even once you get that LED blinking, changing the clock speed for that blinking LED should be near instantaneous, but more likely it requires rebuilding the whole project. Fundamentally the vendors don't view their chips as something designed to run programs, and this legacy hardware design mentality plagues their whole business.
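For what it's worth, the design side of "blink an LED" really is tiny; a minimal Amaranth-flavored sketch (the LED pin would normally come from a board/platform definition, which is omitted here, and the names are made up):

    from amaranth import Elaboratable, Module, Signal

    class Blinky(Elaboratable):
        def __init__(self, divider=12_000_000):  # ~0.5 Hz blink with a 12 MHz clock
            self.divider = divider
            self.led = Signal()

        def elaborate(self, platform):
            m = Module()
            counter = Signal(range(self.divider))
            with m.If(counter == self.divider - 1):
                m.d.sync += [counter.eq(0), self.led.eq(~self.led)]
            with m.Else():
                m.d.sync += counter.eq(counter + 1)
            return m

The blink rate is just a constructor argument, which is exactly why a full project rebuild for such a change feels so absurd.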
Something important here: Xilinx could and should have been where NVidia is today. They were certainly aware of the competitive accelerated computing market as early as 2005, and fundamentally failed to make a software architecture competitive with CUDA.
Before CUDA even existed I interned at Xilinx working on the beginnings of their HLS C compiler. My (decade older) fraternity brother led the C compiler team at Altera. We almost went into making a spreadsheet compiler for FPGAs (my master's thesis) together, but 2007 ended up being a terrible year to sell accelerated computing to Wall Street.
Xilinx never had hardware that was even remotely capable of competing with Nvidia. So I don't think it's solely a software problem: they have literally never developed hardware that is programmable or general-purpose enough. Even their Versal hardware today is hideously difficult to program and has a very FPGA-centric workflow.
This isn't the full story though. I (professionally, as a consultant) analyzed GOPS/$ and GOPS/W for big multi-chip GPU and FPGA systems from 2006-2011.
Xilinx routinely had more I/O (SerDes, 100/200/400G MACs on-die) and, these days, at times more HBM bandwidth than contemporary GPUs. Also deterministic latency and perfectly acceptable DSP primitives.
The gap has always been the software.
Of course NVidia wasn't such an obvious hit either: they flubbed the tablet market due to yield issues, and it really only went exponential in 2014. I invested heavily in NVidia from 2007-2014 because of the CUDA edge they had, but sold my $40K of stock at my cost basis.
I currently do DSP for radar, and implemented the same system on FPGA and in CUDA from 2020-2023. I know for a fact that in 2022 the FFT performance of a $9000 FPGA equaled that of a $16000 A100, which also needed a $10000 computer (the FPGA used fixed point instead of float, so not quite apples-to-apples, but definitely application-equivalent).
It depends on what you want to do. FPGAs excel in periodic "always on" workloads that need deterministic timing and low latency. If you don't have that and just care about total throughput and don't care about energy efficiency, then Nvidia will sell you more tflops per chip.
The energy efficiency of FPGAs can't be overstated. Reducing the clock and voltage to levels comparable to an FPGA will kill your GPU's TFLOPS, and the control overhead and the energy spent on data movement are unavoidable in a GPU.
really? every fpga user I've ever seen bitches extensively about the vendor software
this sounds a lot like what people used to say about shitty expensive vendor software for MCUs
I didn’t say they were benevolent or forgiving deities. XD
The statement “vendor tools are god” is a statement that they are all powerful, and not something you can work against. It’s not pleasant, but it’s a necessary evil.
You’re not going to be able to access any of the most cutting edge features of Xilinx or Intel chips without the vendor tools. Simple as that. They have no interest in open sourcing the tools. Fighting the vendors to change this is trying to fight a force you can’t fight against and win.
Why can't it be? Of course, designing hardware is different from writing software, but it's important to realize that these new age HDLs aren't really HDLs in the strictest sense but rather languages that let you cleanly write RTL generator programs.
It's exactly what I love most about Chisel: it's like having a Verilog preprocessor that wasn't hacked together in twenty minutes as an afterthought.
> these new age HDLs aren't really HDLs in the strictest sense but rather languages that let you cleanly write RTL generator programs
At what point does the abstraction start to help your productivity in developing gateware, rather than hamper it?
I’ve heard more than a few horror stories from gateware devs trying to untangle HDL generated by something like Matlab, Simulink, or even HLS tools like Vitis.
> It's exactly what I love most about Chisel
Would love to hear more about your experience with Chisel. I’ve learned just a bit about it from going to conferences at MIT LL. Seems interesting, but at the same time, I’m also an avowed skeptic when it comes to adding more gateware abstractions on top of raw HDL.
I think what they mean is that it's just a different thing: a description of a physical machine vs. a specification of an algorithm.
> As such, making an HDL look like a programming language doesn't increase my expressiveness or productivity all that much
Ok, now try Chisel or SpinalHDL - which lean on a high-powered type system/compiler - and come back to me.
SystemVerilog/VHDL do not hold a candle to the level of productivity or correctness you can achieve in these Scala-based HDL tools.
The main problem for me here is that this is embedding a kind of DSL in an existing programming language. You can see the shortcomings really clearly with Amaranth. A ton of things that you would expect to be just an operator are function calls. If you want to write a conditional statement, you have an If method. You mark FSM states with strings. You want to concatenate some bits? Cat() instead of just ++. When you are writing Amaranth you are really writing two languages at once: the Amaranth DSL and Python.
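To make that concrete, this is roughly what the complaint looks like in code (a made-up fragment for illustration, not from any real design):

    from amaranth import Cat, Module, Signal

    m = Module()
    req, ack = Signal(), Signal()
    lo, hi = Signal(8), Signal(8)
    word = Signal(16)

    # Conditionals are context managers on the Module, not Python if-statements.
    with m.If(req & ~ack):
        m.d.sync += ack.eq(1)
    with m.Else():
        m.d.sync += ack.eq(0)

    # Concatenation is Cat(), not an operator; FSM states are named by strings.
    m.d.comb += word.eq(Cat(lo, hi))
    with m.FSM():
        with m.State("IDLE"):
            with m.If(req):
                m.next = "BUSY"
        with m.State("BUSY"):
            with m.If(ack):
                m.next = "IDLE"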
There are, however, better takes: Clash, Spade, and Silice are much better in this regard. But I'd still take Amaranth over the big three.
[Not a language designer or anything, but] I dunno, keeping some distinction between Python and the DSL seems useful, and probably makes it simpler to reason about. With too much syntactic sugar, I can imagine it becomes easier to confuse the "thing that is generating the RTL" with "the RTL".
I think of Amaranth as "a way of using Python to organize/generate/simulate RTL." Being embedded in Python is part of the utility behind it!
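A small sketch of that distinction (everything here is invented for illustration): the Python for-loop runs once, at elaboration time, and generates RTL, while the Switch/Case block is the RTL itself and becomes a multiplexer.

    from amaranth import Module, Signal

    m = Module()
    sel = Signal(2)
    inputs = [Signal(8, name=f"in{i}") for i in range(4)]
    out = Signal(8)

    with m.Switch(sel):
        # Python loop: unrolled while *building* the design.
        for i, sig in enumerate(inputs):
            with m.Case(i):
                # Amaranth statement: part of the resulting hardware.
                m.d.comb += out.eq(sig)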
To me it's the equivalent of saying "Yeah, C sucks and is verbose, let's make a C generator using Python. Look ma, I can even make generic data structures and functions".
That is to say, I think the generator approach is a dead end: a local optimum before we move to real languages. Sure, it's a lot better than bare Verilog/SV/VHDL, but that bar is very low. I believe having native language constructs that support designing and simulating hardware is much more powerful.
[flagged]
It appears that your entire profile is simply LLM generated comments, including this comment right here. Please do not do this on HN.
The parent comment absolutely has the LLM stench.
Appears? How so?
> If you've ever been burned by Verilog's "works in sim, breaks in synthesis" gotchas
This is LLM phrasing. It's trying to be witty and relatable about an experience basically no one has ever had - or at least, the people who have had it, being hardware engineers, usually don't try to be witty and relatable about it in forum comments.
I don't know though, the comment as a whole doesn't feel AI-generated, but maybe AI-assisted.
Apart from the telltale direct paraphrasing of the Amaranth home page, the discourse about "works in sim, breaks in synthesis" can be regurgitated from any mildly disappointed essay about Verilog and feels plausibly AI-generated to me.
"The result: a design that passes simulation is far more likely to synthesize identically, averting costly back-end surprises"
This is the sort of phrasing that either a committee of marketing people wrote up, or an LLM.
It's been a while since I've done this stuff, but VHDL seemed like this to me: that generally if it compiles, it synthesises. I really battled with Verilog!
But what is it for? It would be impressive if they used this to develop their Gravitons, but I find that hard to believe. What's the use case?
https://glasgow-embedded.org/
Which uses a single Python codebase to implement both the gateware (as Amaranth) and the software which interacts with it.
https://greatscottgadgets.com/cynthion/ is another in-production user of it, a USB protocol analyzer and tool.
Amaranth is basically SystemC but in Python (and it's much easier to use and works better)