Real-time action chunking with large models

(pi.website)

84 points | by pr337h4m 6 months ago

9 comments

  • fennecbutt 6 months ago

    Alright, I'm building the robot project I was putting off. This is so fucking cool.

    Excellent work!

  • jauntywundrkind 6 months ago

    Anyone have good intro recommendations for VLAs?

  • UltraSane 6 months ago

    I love the implications of a robot that can plug in Ethernet cables.

    • lysp 6 months ago

      Just need one that can plug in USB-A cables the first attempt (I average 3 attempts).

    • meepmorp 6 months ago

      “Soon, a robot will fix the cables in the server room for me!”

      • LoganDark 6 months ago

        New job title: Spaghetti Organizer

  • b0a04gl 6 months ago

    RTC handling 300ms+ delay and still pulling off tasks like plugging in ethernet is kinda nuts. what i'm not getting is how it's keeping the control loop stable without retraining. some sort of latent plan caching?

    • kvablack 6 months ago

      It uses an inpainting algorithm (adapted from image generation literature) to produce future actions that are consistent with the current trajectory. It's sort of like warm-starting from a cached plan, although the plan isn't latent, it's directly in action space. Hopefully that answers your question -- there are many more details in the blog post and paper :)
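
For readers who want a feel for the "inpainting in action space" idea in the comment above, here is a toy sketch. It is not the authors' algorithm. It assumes a hard mask over the first `delay` actions (the ones that will execute while inference is still running) and swaps the learned model for a simple neighbor-averaging smoother. The names `inpaint_chunk`, `prev_chunk`, and `delay` are hypothetical.

```python
import numpy as np

def inpaint_chunk(prev_chunk, delay, steps=500, seed=0):
    """Toy action-space inpainting (illustrative only, not the published method).

    prev_chunk: (horizon, action_dim) array, the plan currently being executed.
    delay: number of leading actions that are already committed and must be kept.
    """
    rng = np.random.default_rng(seed)
    horizon, dim = prev_chunk.shape

    # Mask of actions that are frozen because they run during inference.
    frozen = np.zeros(horizon, dtype=bool)
    frozen[:delay] = True

    # Start the new chunk from noise, as a generative sampler would.
    chunk = rng.normal(size=(horizon, dim))

    for _ in range(steps):
        # Stand-in "denoiser": average each action with its neighbors in time.
        # A real system would call a learned policy here instead.
        padded = np.pad(chunk, ((1, 1), (0, 0)), mode="edge")
        smoothed = (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0
        # Inpainting step: re-impose the known (frozen) actions every iteration
        # so the free actions are pulled toward consistency with them.
        chunk = np.where(frozen[:, None], prev_chunk, smoothed)

    return chunk

# Usage: keep the first 3 actions of an 8-step, 2-D plan and regenerate the rest.
prev = np.linspace(0.0, 1.0, 8)[:, None] * np.ones((1, 2))
new = inpaint_chunk(prev, delay=3)
```

The frozen prefix comes back unchanged, and the regenerated actions join it without a jump. That continuity is the property the comment describes. The blog post and paper cover how the real method achieves it with the actual model.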