OK, I like this. It’s an agent-based add-on to (for now) Gemini that aims at improving the quality of output through a more ‘human’ style of research: digging deeper, considering counterexamples, and fleshing out thin areas with more research.
I’d like to try it, but I just learned I need an Enterprise Agentic subscription of some sort from Google; no idea how much that costs.
That said, this seems like a real abuse of the term diffusion, as far as I can tell. I don’t think this thing is reversing any entropy on any latent space.
They published a paper, and this isn't something so complex that it would take a lot of work to implement. You could probably give Codex an example open-source deep-research project, then sic it on the paper and tell it to make a fork that uses this algorithm; I wouldn't be surprised if it could basically one-shot the implementation.
Yeah, good idea. A virtual lucidrains could reimplement it.
Interesting research, but I wish people would stick to the clearer term “inference-time computation” instead of the more ambiguous and confusing “test-time computation.”
Test/evaluation/inference are treated as almost synonymous because in academic research you almost exclusively run inference on a trained model in order to evaluate its performance on a test set. Of course in the real world, you will want to run inference in production to do useful work. But the language comes from research.
Literally everything you do during inference is inference-time, no?
Well, if all you're doing is accessing stuff that was pre-learned earlier, then it's not quite inference-time.
Huh, I never thought of the process of drafting while writing as similar to how diffusion models start from noise. Super cool for sure, though I'm curious whether this (and other similar research on making models think more at inference time) is showing that the best way for models to "think" is the exact same way humans do.
Does this share techniques with Gemini Diffusion? https://blog.google/technology/google-deepmind/gemini-diffus...
The way I read the paper, "diffusion" was more of a metaphor: you start with the LLM's own output as the overview (very much _not_ random noise) and then refine it over many steps. Seeing this, though, I wonder whether in-house they mean it more literally, or have actually tried a more literal version.
They reference a paper using initial noisy data as a key, mapping to a "jump-ahead" value of a previous example. I think this is very cool and clever, and does use a diffusion model.
But I don't see how this Deep Researcher actually uses diffusion at all. So is it really fair to call it "test-time diffusion" just because you liken an early text draft to noise in a diffusion model, then use RAG to retrieve a potential polished version of said draft?
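For what it's worth, here's roughly how I picture the "draft as noisy state, refine with retrieval" loop the paper's metaphor suggests. This is a minimal sketch under my own assumptions; `generate`, `critique`, and `search` are hypothetical stand-ins for LLM calls and a retrieval backend, not anything from the paper's actual implementation.

```python
# Sketch of a draft-then-refine research loop (my reading of the metaphor, not the paper's code).
def deep_research(question, generate, critique, search, steps=5):
    # Start from the model's own first-pass report, not random noise.
    draft = generate(f"Write a first-pass report answering: {question}")
    for _ in range(steps):
        # "Denoising" step: identify the thinnest, least-supported part of the draft.
        gap = critique(f"Which claim in this draft is weakest or least supported?\n\n{draft}")
        # Retrieve evidence targeted at that gap.
        evidence = search(gap)
        # Rewrite the draft so the retrieved material is folded in.
        draft = generate(
            f"Revise the report below. Strengthen this weak point: {gap}\n"
            f"Use this new evidence: {evidence}\n\nReport:\n{draft}"
        )
    return draft
```

Whether you call that diffusion or just iterative self-revision with RAG is exactly the naming question being argued here.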
Seems like a useful approach to coding assistants as well. Write some draft functionality, notice some patterns or redundancy with the existing code or in the change itself, search for libraries or alternative design patterns that could help out or create something that is targeted to the use case, reimplement in terms of those new components.
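A rough sketch of how that could look in an agent loop. The `llm` and `search_packages` helpers are purely illustrative assumptions, not any existing coding assistant's API.

```python
# Hypothetical sketch of the same draft-then-refine idea applied to code.
def draft_then_refine(task, llm, search_packages, rounds=3):
    # Write some draft functionality first.
    code = llm(f"Write a first draft implementing: {task}")
    for _ in range(rounds):
        # Notice repetition or reinvented wheels in the draft or surrounding change.
        pattern = llm(f"Name one redundant pattern or hand-rolled utility in:\n{code}")
        # Look for an existing library or design pattern that covers it.
        candidates = search_packages(pattern)
        # Reimplement the draft in terms of the better component.
        code = llm(
            f"Rewrite this code to use one of {candidates} instead of the "
            f"hand-rolled version of '{pattern}':\n{code}"
        )
    return code
```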
This is the first time I'm hearing about their https://cloud.google.com/products/agentspace