Categorical Foundations for Cute Layouts

(research.colfax-intl.com)

35 points | by charles_irl 3 days ago

6 comments

  • statusfailed 2 days ago

    Really nice! Had a quick read, here's my quick summary:

    - Arrays are typed `S : D` with shape S and strides D

    - Each of `S` and `D` is a nested tuple (instead of the flat tuples one typically sees in a tensor framework)

    - Together `S` and `D` define the layout of a tensor

    - Not every layout is "tractable", but the tractable ones form a nice category
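    To make the `S : D` summary above concrete, here is a minimal Python sketch of how a nested shape and stride pair might map a 1-D index to a memory offset. The helper names (`flatten`, `size`, `offset`) and the example layout are illustrative assumptions, not code from the article or the CuTe library; the sketch assumes the leftmost mode varies fastest.

```python
def flatten(t):
    """Flatten a nested tuple of ints into a flat tuple (hypothetical helper)."""
    if isinstance(t, int):
        return (t,)
    return tuple(x for sub in t for x in flatten(sub))


def size(shape):
    """Total number of elements described by a (possibly nested) shape."""
    n = 1
    for s in flatten(shape):
        n *= s
    return n


def offset(shape, stride, index):
    """Map a 1-D index to an offset: split the index into per-mode
    coordinates (leftmost mode fastest), then take the inner product
    with the strides."""
    off = 0
    for s, d in zip(flatten(shape), flatten(stride)):
        off += (index % s) * d
        index //= s
    return off


# Illustrative layout ((2, 2), 4) : ((1, 8), 2); the values are made up.
shape = ((2, 2), 4)
stride = ((1, 8), 2)
offsets = [offset(shape, stride, i) for i in range(size(shape))]
print(offsets)
# → [0, 1, 8, 9, 2, 3, 10, 11, 4, 5, 12, 13, 6, 7, 14, 15]
```

    The nesting only groups modes; in this sketch the offset depends on the flattened shape and strides alone.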

    A really good exposition. My only criticism is that it's quite front-heavy: it would be nice to see a detailed example like the one in 2.3.8 earlier in the document, since a lot of the detail presented early doesn't seem necessary to understand the core ideas.

    Last comment: I suspect there is a connection to strictification[0], would love to know more if the authors see this!

    [0]: in the sense I mean here: https://arxiv.org/pdf/2201.11738v3

  • bgavran 3 days ago

    This is an interesting writeup. I wonder if the authors considered a categorical approach to representing general applicative arrays (which might be tree-shaped), as described here (https://www.cs.ox.ac.uk/people/jeremy.gibbons/publications/a...) or here (https://github.com/bgavran/TensorType).

    • godelski a day ago

      FYI, attention wasn't originally proposed in AIAYN. Their main contribution was a fully transformer-based network.

      You could argue they didn't invent dot product attention nor transformers but they definitely formalized those so I'll leave that nitpicking to Schmidhuber lol. But the other stuff, they say just as much in the paper. It's easy to pass credit over to the ones who popularized a technique rather than the many people who developed it.

  • andersa 3 days ago

    Suggest watching this as an intro to what this is about! https://www.youtube.com/watch?v=ufa4pmBOBT8

  • jonathrg 3 days ago

    Needs to be capitalized as CuTe

  • carterschonwald 3 days ago

    This is neat. Reminds me that I am like a decade overdue to write up my own work in this part of array computation