The AI Book

    Google’s Flagship Gemini AI Model Gets a Major Upgrade

    15 February 2024

    Alphabet’s Gemini AI model has been public for only two months, but the company is already releasing an upgrade. Gemini Pro 1.5, launching with limited availability today, is more powerful than its predecessor and can handle huge amounts of text, video, or audio input at a time.

    Demis Hassabis, CEO of Google DeepMind, which developed the new model, compares its vast capacity for input to a person’s working memory, something he explored years ago as a neuroscientist. “The great thing about these core capabilities is that they unlock sort of ancillary things that the model can do,” he says.

    In a demo, Google DeepMind showed Gemini Pro 1.5 analyzing a 402-page PDF of the Apollo 11 communications transcript. The model was asked to find humorous portions and highlighted several moments, like when astronauts said that a communications delay was due to a sandwich break. Another demo showed the model answering questions about specific actions in a Buster Keaton movie. The previous version of Gemini could have answered these questions only for much shorter amounts of text or video. Google hopes that the new capabilities will allow developers to build new kinds of apps on top of the model.

    “It really feels quite magical how the model performs this sort of reasoning across every single page, every single word,” says Oriol Vinyals, a research scientist at Google DeepMind.

    Google says Gemini Pro 1.5 can ingest and make sense of an hour of video, 11 hours of audio, 700,000 words, or 30,000 lines of code at once—several times more than other AI models, including OpenAI’s GPT-4, which powers ChatGPT. The company has not disclosed the technical details behind this feat. Hassabis says that one use for models that can handle large amounts of text, tested by researchers at Google DeepMind, is identifying the important takeaways in Discord discussions with thousands of messages.

    Gemini Pro 1.5 is also more capable, at least for its size, as measured by its scores on several popular benchmarks. The new model uses a technique invented by Google researchers to squeeze out more performance without requiring more computing power. The technique, called mixture of experts, selectively activates the parts of a model’s architecture that are best suited to a given task, making the model more efficient to train and run.
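
    To make the idea of selective activation concrete, here is a minimal sketch of top-k routing, one common way a mixture-of-experts layer is built. It is not Gemini's implementation, which Google has not disclosed. The experts, the gate weights, and the names route_top_k and make_expert are all hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: multiplies every input feature by `scale`."""
    return lambda x: [scale * v for v in x]


def route_top_k(x, gate_weights, experts, k=2):
    """Run only the k experts that the gate scores highest for input x.

    gate_weights holds one row per expert; an expert's score is the dot
    product of its row with x. Experts outside the top k are never called,
    which is where the compute saving comes from.
    """
    scores = [sum(w * v for w, v in zip(row, x)) for row in gate_weights]
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    weights = softmax([scores[i] for i in top])

    output = [0.0] * len(x)
    for weight, index in zip(weights, top):
        expert_out = experts[index](x)
        output = [o + weight * y for o, y in zip(output, expert_out)]
    return output, top


# Illustrative setup (assumed values): four experts, two input features.
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [-1.0, 0.0]]

output, chosen = route_top_k([1.0, 2.0], gate_weights, experts, k=2)
print("experts used:", chosen)
print("combined output:", output)
```

    In this toy setup only two of the four experts run for each input, so the work per input stays roughly constant even if more experts are added. That is the general sense in which the technique can add capacity without a matching rise in computing cost.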

    Google says that Gemini Pro 1.5 is as capable as its most powerful offering, Gemini Ultra, in many tasks, despite being a significantly smaller model. Hassabis says there is no reason why the same technique used to improve Gemini Pro cannot be applied to boost Gemini Ultra.

    The upgraded version of Gemini Pro will be made available to developers through AI Studio, a sandbox for testing model capabilities, and to a limited number of developers through Google’s Vertex AI cloud platform API. There’s no date yet for a general release.

    Google is also launching new tools to help developers use Gemini in their applications, including new ways of tapping into the models’ ability to parse video and audio. The company also said it is adding new Gemini-powered features to its web-based coding tool, Project IDX, including ways for AI to debug and test code.

    The speed of Gemini’s upgrade is a sign of a furious AI race kicked off by the success of ChatGPT. Earlier this week, OpenAI announced that it is giving ChatGPT the ability to remember useful information from conversations over long periods of time. Last week, Google rebranded its chatbot Bard and announced that Gemini Ultra would be available with a paid subscription.

    The frenetic pace of progress in generative AI is at odds with worries about the risks the technology might pose. Google says it has put Gemini Pro 1.5 through extensive testing and that providing limited access offers a way to gather feedback on potential risks. The company says it has also provided researchers at the UK’s AI Safety Institute with access to its most powerful models so that they can test them.

    Hassabis says to expect more advances in the months to come. “This is a new cadence,” he says, “I’m trying to bring from a sort of startup mentality.”
