Categorical Foundations for Cute Layouts

(research.colfax-intl.com)

35 points | by charles_irl 3 days ago

6 comments

statusfailed 2 days ago
Really nice! Had a quick read, here's my quick summary:
- Arrays are typed `S : D` with shape S and strides D
- Each of `S` and `D` is a nested tuple (instead of the flat tuples one typically sees in a tensor framework)
- Together `S` and `D` define the layout of a tensor
- Not every layout is "tractable", but the tractable ones form a nice category
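To make the summary above concrete, here is a minimal Python sketch of how a nested shape `S` and nested strides `D` could together map a linear index to a memory offset. It is not the CuTe API: the helper names (`flatten`, `index_to_coord`, `offset`, `layout`) are hypothetical, and the sketch assumes the leftmost mode varies fastest (colexicographic order) and that `S` and `D` have the same nesting.

```python
def flatten(t):
    """Flatten a nested tuple of ints into a flat tuple."""
    if isinstance(t, int):
        return (t,)
    return tuple(x for item in t for x in flatten(item))


def size(shape):
    """Total number of elements described by a (nested) shape."""
    n = 1
    for s in flatten(shape):
        n *= s
    return n


def index_to_coord(i, shape):
    """Convert a linear index to a flat coordinate.

    Assumption: the leftmost mode varies fastest (colexicographic order).
    """
    coord = []
    for s in flatten(shape):
        coord.append(i % s)
        i //= s
    return tuple(coord)


def offset(coord, stride):
    """Inner product of a coordinate with the (nested) strides."""
    d = flatten(stride)
    assert len(coord) == len(d), "shape and stride must have the same rank"
    return sum(ci * di for ci, di in zip(coord, d))


def layout(shape, stride):
    """Tabulate the offset for every linear index of the layout S : D."""
    return [offset(index_to_coord(i, shape), stride) for i in range(size(shape))]


col_major = layout((2, 3), (1, 2))            # a column-major 2x3 layout
row_major = layout((2, 3), (3, 1))            # a row-major 2x3 layout
nested = layout((2, (2, 2)), (4, (2, 1)))     # nested shape and strides
```

Under these assumptions, the nesting only groups modes; the offset is still a sum of coordinate-times-stride terms over the flattened tuples.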
A really good exposition. My only criticism is that it's quite front-heavy: it would be nice to see a detailed example like the one in 2.3.8 earlier in the document, since a lot of the detail presented early doesn't seem necessary to understand the core ideas.
Last comment: I suspect there is a connection to strictification[0]. I'd love to know more if the authors see this!
[0]: in the sense I mean here: https://arxiv.org/pdf/2201.11738v3
bgavran 3 days ago
This is an interesting writeup. I wonder whether the authors considered a categorical approach to representing general applicative arrays (which might be tree-shaped), as described here (https://www.cs.ox.ac.uk/people/jeremy.gibbons/publications/a...) or here (https://github.com/bgavran/TensorType).
- godelski a day ago
  FYI attention wasn't originally proposed in AIAYN. Their main contribution was a fully transformer-based network.
  You could argue they didn't invent dot-product attention or transformers, but they definitely formalized those, so I'll leave that nitpicking to Schmidhuber lol. As for the other stuff, they say as much in the paper. It's easy to pass credit over to the ones who popularized a technique rather than the many people who developed it.
andersa 3 days ago
Suggest watching this as an intro to what this is about! https://www.youtube.com/watch?v=ufa4pmBOBT8
jonathrg 3 days ago
Needs to be capitalized as CuTe
carterschonwald 3 days ago
This is neat. Reminds me that I am like a decade overdue to write up my own work in this part of array computation