The AI Book
    Daily AI News

DeepMind unveils RT-2, a new AI that makes robots smarter

31 July 2023



Google DeepMind has announced Robotics Transformer 2 (RT-2), a first-of-its-kind vision-language-action (VLA) model that enables robots to perform novel tasks without task-specific training.

    Just like how language models learn general ideas and concepts from web-scale data, RT-2 uses text and images from the web to understand different real-world concepts and translate that knowledge into generalized instructions for robotic actions. 
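RT-2 is reported to express robot actions as strings of discretized text tokens, so the same language model that reads web text can "speak" motor commands. The sketch below illustrates that idea only; the exact token layout and the `decode_action` helper are hypothetical illustrations, not DeepMind's actual implementation.

```python
# Illustrative sketch of emitting robot actions as plain text tokens,
# in the spirit of RT-2. The 8-field layout below is an assumption.

def decode_action(token_string: str) -> dict:
    """Map a model-emitted token string to a structured robot action.

    Assumed layout: "<terminate> <dx> <dy> <dz> <droll> <dpitch> <dyaw> <grip>"
    where each field is an integer bin in [0, 255].
    """
    bins = [int(tok) for tok in token_string.split()]
    assert len(bins) == 8, "expected 8 integer bins"

    def to_unit(b: int) -> float:
        # Map a bin in 0..255 onto the continuous range [-1.0, 1.0].
        return (b / 255.0) * 2.0 - 1.0

    return {
        "terminate": bins[0] == 1,
        "translation": [to_unit(b) for b in bins[1:4]],  # dx, dy, dz
        "rotation": [to_unit(b) for b in bins[4:7]],     # droll, dpitch, dyaw
        "gripper": bins[7] / 255.0,                      # 0 = open, 1 = closed
    }

# A model fine-tuned this way answers "what should the arm do?" with a
# token string like the one below, which the robot stack then decodes.
action = decode_action("0 132 120 128 127 128 129 255")
print(action["terminate"], action["gripper"])  # → False 1.0
```

Because the action space is just another vocabulary, web-scale pretraining and robot fine-tuning can share one model and one output format.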

As it improves, this technology could lead to context-aware, adaptable robots that perform different tasks across situations and environments, with far less training than is currently required.

What makes DeepMind’s RT-2 unique?

Back in 2022, DeepMind debuted RT-1, a multi-task model trained on 130,000 demonstrations that enabled Everyday Robots to perform 700-plus tasks with a 97% success rate. Now, combining the robotic demonstration data from RT-1 with web datasets, the company has trained the model’s successor: RT-2.


The biggest highlight of RT-2 is that, unlike RT-1 and other models, it does not require hundreds of thousands of data points to get a robot working. Until now, handling complex, abstract tasks in highly variable environments has required task-specific robot training covering every single object, environment and situation.

    However, in this case, RT-2 learns from a small amount of robotic data to perform the complex reasoning seen in foundation models and transfer the knowledge acquired to direct robotic actions – even for tasks it’s never seen or been trained to do before.

“RT-2 shows improved generalization capabilities and semantic and visual understanding beyond the robotic data it was exposed to,” Google explains. “This includes interpreting new commands and responding to user commands by performing rudimentary reasoning, such as reasoning about object categories or high-level descriptions.”

    Taking action without training

    According to Vincent Vanhoucke, head of robotics at Google DeepMind, training a robot to throw away trash previously meant explicitly training the robot to identify trash, as well as pick it up and throw it away.

    But with RT-2, which is trained on web data, there’s no need for that. The model already has a general idea of what trash is and can identify it without explicit training. It even has an idea of how to throw away the trash, even though it’s never been trained to take that action.

On seen tasks in internal tests, RT-2 performed just as well as RT-1. On novel, unseen scenarios, however, its success rate nearly doubled, to 62% from RT-1’s 32%.
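As a quick sanity check on those reported figures, the jump from 32% to 62% is just under a 2x improvement:

```python
rt1_success = 0.32  # RT-1 on unseen scenarios (as reported)
rt2_success = 0.62  # RT-2 on unseen scenarios (as reported)
improvement = rt2_success / rt1_success
print(f"{improvement:.2f}x")  # → 1.94x, i.e. "nearly doubled"
```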

    Potential applications

As they advance, vision-language-action models like RT-2 could lead to context-aware robots that reason, problem-solve and interpret information to perform a diverse range of real-world actions depending on the situation at hand.

For instance, instead of robots performing the same repeated actions in a warehouse, enterprises could see machines that handle each object differently, taking into account its type, weight, fragility and other attributes.

According to Markets and Markets, the AI-driven robotics market is expected to grow from $6.9 billion in 2021 to $35.3 billion in 2026, a CAGR of 38.6%.
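The quoted growth rate is consistent with the dollar figures: a compound annual growth rate over the five years from 2021 to 2026 works out as follows.

```python
# CAGR = (end_value / start_value) ** (1 / years) - 1
start, end, years = 6.9, 35.3, 5  # $B, 2021 -> 2026
cagr = (end / start) ** (1 / years) - 1
print(f"{cagr:.1%}")  # → 38.6%, matching the cited figure
```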

    VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact. Discover our Briefings.
