Google’s latest innovation in robotics, Robotics Transformer 2 (RT-2), is a groundbreaking vision-language-action (VLA) model that brings us closer to a future of helpful robots.

RT-2, a Transformer-based model, is trained on text and images from the web and can directly output robotic actions. In effect, it can “speak robot.”
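One common way for a language model to output actions as text is to discretize each continuous action dimension into a fixed number of bins and write the bin indexes as a string of tokens. The sketch below illustrates that general idea only. The bin count, the value range, and the function names are assumptions made for this example, not details taken from RT-2 itself.

```python
from typing import List

NUM_BINS = 256          # assumed number of discrete bins per action dimension
LOW, HIGH = -1.0, 1.0   # assumed normalized range for every action dimension


def encode_action(action: List[float]) -> str:
    """Map continuous action values to a space-separated string of bin indexes."""
    tokens = []
    for value in action:
        # Clip out-of-range values so they land in the first or last bin.
        clipped = min(max(value, LOW), HIGH)
        index = round((clipped - LOW) / (HIGH - LOW) * (NUM_BINS - 1))
        tokens.append(str(index))
    return " ".join(tokens)


def decode_action(text: str) -> List[float]:
    """Invert encode_action: turn bin indexes back into approximate values."""
    return [LOW + int(tok) / (NUM_BINS - 1) * (HIGH - LOW) for tok in text.split()]
```

Under these assumptions, a robot controller would parse the model's text output with something like `decode_action` and pass the resulting values to the motors. The round trip loses at most half a bin width of precision per dimension.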

Developing robots that can handle complex, abstract tasks in diverse and unfamiliar environments has been a challenging endeavor. Unlike chatbots, robots require real-world grounding and understanding of their capabilities.

Traditionally, this meant training robots on billions of data points, which was time-consuming and impractical for most innovators.

RT-2 takes a new approach to the problem. It improves robots’ reasoning abilities and eliminates the need for complex stacks of systems by enabling a single model to perform both complex reasoning and robot actions.

With only a small amount of robot training data, RT-2 can transfer knowledge from its language and vision training data to direct robot actions, including for tasks it has never been explicitly trained to do.

The benefits of RT-2 are significant. It allows robots to adapt rapidly to novel situations and environments: RT-2 performs as well as previous models on tasks in its robot training data and significantly outperforms them on unseen scenarios.

Moreover, RT-2’s ability to transfer learned concepts to new situations brings robots closer to learning and adapting more like humans.

This advancement not only signifies the convergence of AI and robotics but also holds immense promise for the development of more general-purpose robots that can better serve human-centered environments.

While there is still much work to be done to fully realize the potential of helpful robots, RT-2 provides a glimpse of an exciting future for robotics, one where robots learn from diverse data sources and tackle a wide array of tasks as advanced and capable assistants.

