24 comments

  • ganiszulfa 4 days ago

    Amazing project and amazing write-up; I especially like the animations. What's the end goal here? Putting these TPUs in consumers' hands or in edge devices?

  • jacquesm 4 days ago

    Sometimes the most satisfying projects are the ones where you don't know that you really don't know what you are doing. Kudos, this is amazing work.

    • evxxan 4 days ago

      Thank you!

  • skybrian 4 days ago

    It's unclear to me what the end result is. Did you build real hardware or is it simulated somehow? If it's hardware, what kind and how did you make it?

    • jacquesm 4 days ago

      Verilog spec, by the looks of it. So you should be able to make it work on an FPGA or, if you happen to have a chip fab in your garage, you might want to make your own silicon ;) I'd go the FPGA route.

    • antognini 4 days ago

      Based on the code in the repo, it looks like they designed the chip in Verilog and then ran it in a simulator. But if they have the Verilog code, in principle they could send it off to a fab and get real hardware back.

      • UncleOxidant 4 days ago

        Next step would be to try it out in an FPGA.

    • zhainya 4 days ago

      I feel like I missed a whole section somewhere. "Built a toy TPU". What does that mean? I have no idea what was actually "built" here.

      • evxxan 4 days ago

        By "toy TPU", we mean that we simulated the forward pass + backprop on a minimal TPU-like accelerator.

    • evxxan 4 days ago

      All in simulation :)

  • airza 4 days ago

    What did you use to make the illustrations? It looks nice.

    • frutiger 4 days ago

      Not OP, but these look like Excalidraw.

  • zoobab 4 days ago

    Maybe try to build a proto with LiteX?

  • utopcell 4 days ago

    The Google team used Chisel instead of SystemVerilog. You could consider switching to that if it makes sense for your project.

    • FirmwareBurner 4 days ago

      > The Google team used Chisel instead of SystemVerilog.

      Not sure blindly copying whatever Google is doing is always the right idea for small projects.

      They have unlimited ad money and some quirky hiring practices, so they can afford to have development practices that go against HW industry norms, just for shits and giggles, without worrying about the costs.

  • UncleOxidant 4 days ago

    Have you tried it out in an FPGA?

    • evxxan 4 days ago

      Not yet! But that's our next step.

      • utopcell 4 days ago

        Tang Nano 20K. You can't find a cheaper FPGA board than this.

        • UncleOxidant 3 days ago

          You can apparently use the open source yosys/nextpnr tools with the Tang Nano 9K but, unless something has changed recently, nextpnr doesn't work with the 20K yet. However, I found the Gowin tools to work reasonably well (and definitely way less bloated than the Xilinx & Altera tools).

        • addaon 4 days ago

          At a higher price point but with more capability, Digilent has a one-week 20% sale on their FPGA boards right now. Some good options (Artix-7 and Spartan-7) within spitting distance of $100.

          • UncleOxidant 3 days ago

            From what it looks like (Xilinx parts primarily), if I bought one of these boards I'd be stuck using either Altera or Xilinx tools. I think some Spartan-7s work with yosys/nextpnr, but I'm not sure how well.

            • addaon 3 days ago

              Yep. The Xilinx tools are very, very good, but they're definitely proprietary.

              • UncleOxidant 2 days ago

                > The Xilinx tools are very, very good

                Ummm... no, that has not been my experience at all. I'd replace 'good' with 'buggy' in that sentence. And also very, very bloated, like 90GB bloated. I've had good experiences using yosys/nextpnr/SymbiFlow, but that's kind of limited to the Lattice iCE40 and ECP5 families and QuickLogic.

  • skyzouwdev 4 days ago

    This is super cool. The fact that you went in without hardware experience and still pushed through makes it even more impressive. I like the philosophy of trying the “hacky” way first instead of just copying existing designs; it’s probably the fastest path to real understanding. Curious, what was the hardest part, where you almost gave up?