8 comments

  • Scaevolus 5 hours ago

    Client-side frame extraction is far too slow to be usable for large volumes of data.

    You want to precompute the contact sheets and serve them to users. You can encode them with VP9, mux them into the IVF container, and use the WebCodecs API to decode them in the browser (2000-3000 bytes per 240x135 frame, so ~3MB/hour for a thumbnail every 4 seconds). Alternatively, you can build the contact sheets as JPEGs, but there are dimension restrictions, reflow is slightly fiddly, and it doesn't exploit inter-frame compression.
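    For reference, a rough sketch of the browser side (assuming WebCodecs support; the IVF offsets are per the spec, and paintThumb is a hypothetical callback that blits each decoded frame into the grid):

        // Decode an IVF-muxed VP9 contact-sheet stream with WebCodecs.
        // Sketch only: whole file in memory, no error handling, and it
        // assumes the stream was encoded with a single leading keyframe.
        async function decodeIvf(url: string, paintThumb: (f: VideoFrame) => void) {
          const buf = new Uint8Array(await (await fetch(url)).arrayBuffer());
          const view = new DataView(buf.buffer);
          const decoder = new VideoDecoder({
            output: (frame) => { paintThumb(frame); frame.close(); },
            error: (e) => console.error(e),
          });
          decoder.configure({ codec: 'vp09.00.10.08' }); // VP9 profile 0, 8-bit
          let off = 32; // skip the 32-byte IVF file header
          while (off + 12 <= buf.length) {
            const size = view.getUint32(off, true); // 12-byte frame header: size + pts
            const pts = Number(view.getBigUint64(off + 4, true));
            decoder.decode(new EncodedVideoChunk({
              type: off === 32 ? 'key' : 'delta',
              timestamp: pts,
              data: buf.subarray(off + 12, off + 12 + size),
            }));
            off += 12 + size;
          }
          await decoder.flush();
        }

    Encoding each thumbnail as a delta frame against the previous one is what gets you down to the ~2-3KB-per-frame range.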

    I made a simple Python/Flask utility for lossless cutting that uses this to present a giant contact sheet to quickly select portions of a video to extract.

    • haasiy 16 minutes ago

      Actually, I started with the precomputing approach you mentioned. But I realized that for many users, setting up a backend to process videos or managing pre-generated assets is a huge barrier.

      I purposely pivoted to 100% client-side extraction to achieve zero server load and a one-line integration. While it has limits with massive data, the 'plug-and-play' nature is the core value of VAM-Seek. I'd rather give people a tool they can use in 5 seconds than a high-performance system that requires 5 minutes of server config.

  • fc417fc802 5 hours ago

    > All frame extraction happens client-side via canvas – no server processing, no pre-generated thumbnails.

    Doesn't that mean the client has to grab a bunch of extra data when it first opens the page, at least if the user calls up the seek feature? You effectively have to fetch frames from throughout the video to generate the initial batch. It seems like it would make more sense to have server-side thumbnails here, as long as they're reasonably sparse and low quality.

    Although I admit that one line client side integration is quite compelling.

    • haasiy 12 minutes ago

      Exactly. I view this cache similarly to how a browser (or Google Image Search) caches thumbnails locally. Since I'm only storing small Canvas elements, the memory footprint is much smaller than the video itself. To keep it sustainable, I'm planning to implement a trigger to clear the cache whenever the video source changes, ensuring the client's memory stays fresh.
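      Roughly like this (a sketch; thumbCache and the 'emptied' hook are illustrative, not the actual VAM-Seek internals):

          const video = document.querySelector('video')!;
          const thumbCache = new Map<number, HTMLCanvasElement>(); // keyed by timestamp (s)

          // 'emptied' fires when the media element resets (e.g. the src
          // changes and load() runs), so stale thumbnails never outlive
          // the video they were extracted from.
          video.addEventListener('emptied', () => thumbCache.clear());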

  • dotancohen 5 hours ago

    This looks absolutely terrific if it is performant. How long does this library take to generate the thumbnails and the seek bar for, e.g., a 60-minute video on 8-year-old desktop hardware? Or on older mobile devices? For reference, my current desktop is from 2012.

    • haasiy 11 minutes ago

      Love the setup! A 2012 machine is a classic.

      To answer your question: VAM-Seek doesn't pre-render the entire 60 minutes. It only extracts frames for the visible grid (e.g., 24-48 thumbnails) using the browser's hardware acceleration via Canvas.
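      The general technique looks like this (a sketch of seek-and-blit extraction, not the exact VAM-Seek code; the 240x135 size and 24-cell grid are illustrative):

          // Seek a muted, preloaded <video> and copy one frame to a small canvas.
          function grabFrame(video: HTMLVideoElement, t: number): Promise<HTMLCanvasElement> {
            return new Promise((resolve) => {
              video.addEventListener('seeked', () => {
                const canvas = document.createElement('canvas');
                canvas.width = 240;  // small thumbnails keep the memory footprint low
                canvas.height = 135;
                canvas.getContext('2d')!.drawImage(video, 0, 0, canvas.width, canvas.height);
                resolve(canvas);
              }, { once: true });
              video.currentTime = t; // kicks off the (possibly network-bound) seek
            });
          }

          // Fill only the visible cells; a single element can only seek one
          // position at a time, so extraction runs sequentially.
          async function fillGrid(video: HTMLVideoElement, cells = 24) {
            const step = video.duration / cells;
            for (let i = 0; i < cells; i++) {
              const thumb = await grabFrame(video, i * step);
              document.body.appendChild(thumb); // or wherever the grid lives
            }
          }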

      On older hardware, the bottleneck is usually the browser's video seeking speed, not the generation itself. Even on a 2012 desktop, it should populate the grid in a few seconds. If it takes longer... well, that might be your PC's way of asking for a retirement plan! ;)

  • littlestymaar 7 hours ago

    The idea is very compelling; it solves a real use case. I will definitely take inspiration from that.

    However, the execution is meh. The UX is terrible (on mobile at least), and the code and documentation are an overly verbose mess. The entire project ought to fit within the size of the AI-generated README. Using AI for exploration and prototyping is fine, but you can't ship that slop, mate; you need to do the polishing yourself.

    • haasiy 18 minutes ago

      I intentionally used AI to draft the README so it's optimized for other AI tools to consume. My priority wasn't 'polishing' for human aesthetics, but rather hitting the 15KB limit and ensuring 100% client-side execution. I'd rather spend my time shipping the next feature than formatting text.