MyTorch – Minimalist autograd in 450 lines of Python

(github.com)

93 points | by iguana2000 a day ago

17 comments

brandonpelfrey 14 hours ago
Having written a slightly more involved version of this recently myself, I think you did a great job of keeping this compact while still readable. This style of library requires some design for sure.
Supporting higher order derivatives was also something I considered, but it’s basically never needed in production models from what I’ve seen.
- iguana2000 8 hours ago
  Thanks! I agree about the style
jerkstate a day ago
Karpathy’s micrograd did it first (and better); start here: https://karpathy.ai/zero-to-hero.html
- alkh a day ago
  Imho, we should let people experiment as much as they want. Having more examples is better than fewer. Still, thanks for the link to the course; it's a top-notch one.
- iguana2000 18 hours ago
  Karpathy's material is excellent! This was a project I made for fun, and it hopefully provides a different perspective on how this can look.
- richard_chase a day ago
  Harsh.
- whattheheckheck a day ago
  Why is it better?
  - forgotpwd16 17 hours ago
    Cleaner, more straightforward, more compact code, and considered complete in its scope (i.e. implementing backpropagation with a PyTorch-y API and training a neural network with it). MyTorch appears to be the author's self-experiment without a concrete vision/plan. That is better for the author but worse for outsiders/readers.
    P.S. The course goes far beyond micrograd, to makemore (transformers), minbpe (tokenization), and nanoGPT (LLM training/loading).
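For readers unfamiliar with the scope mentioned above, here is a rough sketch of what "backpropagation with a PyTorch-y API" can look like at the scalar level. The `Value` class, its attribute names, and the traversal are hypothetical, written for illustration only; they are not taken from MyTorch or micrograd, and only addition and multiplication are covered.

```python
# Illustrative scalar reverse-mode autodiff. `Value` and everything inside it
# are assumptions for this sketch, not code from either project.

class Value:
    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0                  # accumulated d(output)/d(self)
        self._parents = parents          # nodes this value was computed from
        self._backward = lambda: None    # pushes self.grad to the parents

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Build a topological order so each node's gradient is complete
        # before it is propagated to its parents.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x = Value(3.0)
y = Value(2.0)
z = x * y + x      # dz/dx = y + 1, dz/dy = x
z.backward()
```

After `z.backward()`, `x.grad` and `y.grad` hold the partial derivatives of `z`. Gradients are accumulated with `+=` because `x` is used twice in the expression, so its contributions from both uses must be summed.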
  - tfsh a day ago
    Because it's an acclaimed, often-cited course by a preeminent AI researcher (and founding member of OAI) rather than four undocumented Python files.
    - gregjw 21 hours ago
      Being acclaimed is a poor measure of success; there's always room for improvement. How about some objective comparisons?
    - geremiiah 17 hours ago
      Ironically, the reason Karpathy's is better is that he livecoded it, so I can be sure it's not some LLM vomit. Unfortunately, we are now inundated with newbies posting their projects/tutorials/guides in the hopes that doing so will catch the eye of a recruiter and land them a high-paying AI job. That's not so bad in itself, except that most of these people are completely clueless and posting AI slop.
      - iguana2000 16 hours ago
        Haha, couldn't agree with you more. This, however, isn't AI slop. You can see in the commit history that this is from 3 years ago
    - nurettin a day ago
      Objective measures like branch depth, execution speed, memory use and correctness of the results be damned.
      - CamperBob2 21 hours ago
        Karpathy's implementation is explicitly for teaching purposes. It's meant to be taken in alongside his videos, which are pretty awesome.
khushiyant 17 hours ago
A better README would be the way to go.
- CamperBob2 11 hours ago
  In iguana2000's defense, the code is highly self-documenting.
  It arguably reads cleaner than Karpathy's in some respects, as he occasionally gets a little ahead of his students with his 1337 Python skillz.
jjzkkj a day ago
HmcKk