Google DeepMind Integrates Gemini 1.5 Pro into Robots for Real-World Navigation

July 12, 2024 FNN

DeepMind said that the Gemini-integrated robots were able to perform Multimodal Instruction Navigation

Google DeepMind has announced advances in robotics, showing how its Gemini 1.5 Pro model improves robots’ ability to navigate real-world environments. In a recent demonstration, a robot guided users to specific locations, such as a whiteboard, in response to contextual queries.

Key Features of Gemini 1.5 Pro

DeepMind’s Gemini 1.5 Pro has a long context window of up to 2 million tokens, which lets robots process extensive information about their surroundings. This helps the robots understand and remember details of the environments they operate in, so they can assist users effectively.

For example, when asked about “popular ice cream flavors,” a model with a short context window might simply list flavors, whereas a model with a larger context window can also analyze trends and popularity using data from various sources.
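To put the 2 million token figure in perspective, here is a rough back-of-the-envelope sketch of how much video could fit in such a window. The per-frame token cost and the sampling rate are assumptions chosen for illustration, not figures from DeepMind's announcement, and the function names are hypothetical.

```python
# Rough context-budget estimate for a video tour.
# Only the 2 million token window comes from the article;
# the other constants are illustrative assumptions.
CONTEXT_WINDOW_TOKENS = 2_000_000  # figure quoted in the article
TOKENS_PER_FRAME = 258             # ASSUMED cost of one video frame
FRAMES_PER_SECOND = 1              # ASSUMED sampling rate


def max_tour_seconds(context_tokens: int, reserved_tokens: int = 0) -> int:
    """Return how many seconds of sampled video fit in the remaining budget."""
    usable = max(context_tokens - reserved_tokens, 0)
    frames = usable // TOKENS_PER_FRAME
    return frames // FRAMES_PER_SECOND


def fits_in_context(
    tour_seconds: int,
    context_tokens: int = CONTEXT_WINDOW_TOKENS,
    reserved_tokens: int = 0,
) -> bool:
    """Check whether a tour of the given length fits alongside reserved tokens."""
    needed = tour_seconds * FRAMES_PER_SECOND * TOKENS_PER_FRAME
    return needed + reserved_tokens <= context_tokens


if __name__ == "__main__":
    seconds = max_tour_seconds(CONTEXT_WINDOW_TOKENS)
    print(f"About {seconds / 60:.0f} minutes of video at the assumed rates")
    print("One-hour tour fits:", fits_in_context(3600))
```

Under these assumptions, a tour lasting around two hours would fit in the window, which gives a sense of why a long context matters for remembering an entire building.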

How It Works

Using this long context, DeepMind trains its robots to follow human instructions and apply common-sense reasoning. The robots can use video tours and other inputs to navigate spaces more intelligently. With Gemini 1.5 Pro, they can also interpret vague requests and provide relevant assistance.
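The idea of navigating from a video tour can be illustrated with a toy sketch. This is not DeepMind's implementation: the `TourFrame` type, the text captions, and the word-overlap scoring in `pick_goal_frame` are all hypothetical stand-ins for what the multimodal model would do with the actual video frames.

```python
from dataclasses import dataclass


@dataclass
class TourFrame:
    index: int
    description: str                 # hypothetical caption; a real system would use the image
    position: tuple[float, float]    # hypothetical map coordinates of the frame


def pick_goal_frame(request: str, tour: list[TourFrame]) -> TourFrame:
    """Toy stand-in for the model: pick the frame whose caption
    shares the most words with the user's request."""
    request_words = set(request.lower().split())

    def overlap(frame: TourFrame) -> int:
        return len(request_words & set(frame.description.lower().split()))

    # On ties, max() returns the earliest frame in the tour.
    return max(tour, key=overlap)


# A made-up three-frame tour of an office.
TOUR = [
    TourFrame(0, "entrance door and coat rack", (0.0, 0.0)),
    TourFrame(1, "whiteboard with markers on the wall", (4.0, 2.5)),
    TourFrame(2, "kitchen counter with coffee machine", (9.0, 1.0)),
]

if __name__ == "__main__":
    goal = pick_goal_frame("where can i draw something on a whiteboard", TOUR)
    print(f"Navigate to frame {goal.index} at {goal.position}")
```

Simple keyword matching like this would fail on truly vague requests; the point of using a model such as Gemini 1.5 Pro is that it can reason over the whole tour instead of matching words.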

Collaborative Technology

Alongside Gemini, DeepMind employs its Robotic Transformer 2 (RT-2) model, which combines vision, language, and action. This model learns from both web data and real-world robotics data, helping the AI understand and interact with its environment more effectively.

DeepMind’s innovations in AI and robotics highlight a significant step towards creating smarter, more capable machines that can assist humans in everyday tasks and complex environments.
