• AutoTL;DR@lemmings.world · 11 months ago

    This is the best summary I could come up with:


    The six-minute video shows off Gemini’s multimodal capabilities (spoken conversational prompts combined with image recognition, for example).

    Gemini seemingly recognizes images quickly — even connect-the-dots pictures — responds within seconds, and tracks a wad of paper in a cup-and-ball game in real time.

    “That’s quite different from what Google seemed to be suggesting: that a person could have a smooth voice conversation with Gemini as it watched and responded in real-time to the world around it,” writes Bloomberg columnist Parmy Olson.

    In a situation like this, Olson says Google is “showboating” to distract people from the fact that Gemini still lags behind OpenAI’s GPT.

    When asked about the validity of the demo, Google pointed The Verge to a post from Oriol Vinyals, vice president of research and deep learning lead at Google DeepMind (and co-lead for Gemini), which explains how the team made the video.

    That’s certainly one way to approach this situation, but it might not be the right one for Google — which has already appeared, at least to the public eye, to have been caught flat-footed by OpenAI’s enormous success this year.


    The original article contains 546 words; the summary contains 186 words. Saved 66%. I’m a bot and I’m open source!