Google DeepMind Integrates Gemini 1.5 Pro into Robots for Real-World Navigation
Google DeepMind has shown how its Gemini 1.5 Pro model improves robots’ ability to navigate real-world environments. In a recent demonstration, a robot guided users to specific locations, such as a whiteboard, in response to contextual queries.
Key Features of Gemini 1.5 Pro
DeepMind’s Gemini 1.5 Pro utilizes a long context window of 2 million tokens, allowing robots to process extensive information about their surroundings. This capability helps the robots understand and remember details about various environments, enabling them to assist users effectively.
For example, when asked about “popular ice cream flavors,” a traditional AI might simply list flavors. With a larger context window, a model can also analyze trends and popularity using data from various sources.
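To give a rough sense of scale for a 2-million-token window, the arithmetic below estimates how much tour video could fit in one prompt. This is only a back-of-the-envelope sketch: the per-frame token cost, the sampling rate, and the number of tokens reserved for text are illustrative assumptions, not figures from DeepMind's demonstration, and the helper name frames_that_fit is hypothetical.

```python
def frames_that_fit(context_tokens: int, tokens_per_frame: int,
                    reserved_tokens: int = 0) -> int:
    """Return how many video frames fit in the context window
    after setting aside room for instructions and the reply."""
    if tokens_per_frame <= 0:
        raise ValueError("tokens_per_frame must be positive")
    usable = max(context_tokens - reserved_tokens, 0)
    return usable // tokens_per_frame


CONTEXT_TOKENS = 2_000_000   # window size quoted in the article
TOKENS_PER_FRAME = 258       # assumption: illustrative cost of one frame
FRAMES_PER_SECOND = 1        # assumption: video sampled once per second
RESERVED_TOKENS = 10_000     # assumption: budget kept for text

frames = frames_that_fit(CONTEXT_TOKENS, TOKENS_PER_FRAME, RESERVED_TOKENS)
minutes = frames / FRAMES_PER_SECOND / 60
print(f"{frames} frames, about {minutes:.0f} minutes of video")
```

Under these assumptions the window holds on the order of two hours of sampled video, which is why a full walkthrough of an office can stay in context at once.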
How It Works
Using this long context window, DeepMind trains its robots to follow human instructions and apply common-sense reasoning. The robots can draw on video tours and other inputs to navigate spaces more intelligently. With Gemini 1.5 Pro, robots can now interpret vague requests and provide relevant assistance.
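The general idea of going from a request to a route can be sketched in a few lines. The code below is a toy illustration, not DeepMind's implementation: the tour data, the captions, and both function names are hypothetical, and a simple word-overlap score stands in for the model's reasoning over the tour video. It picks the tour frame that best matches an instruction, then finds a route to it with breadth-first search over a small graph of connected frames.

```python
from collections import deque

# Hypothetical tour: each frame has a text caption (a stand-in for what
# the camera saw) and the frames that are directly reachable from it.
TOUR = {
    0: {"caption": "entrance lobby with reception desk", "neighbors": [1]},
    1: {"caption": "hallway with printer", "neighbors": [0, 2, 3]},
    2: {"caption": "meeting room with whiteboard and markers", "neighbors": [1]},
    3: {"caption": "kitchen with fridge and coffee machine", "neighbors": [1]},
}


def pick_goal_frame(instruction, tour):
    """Toy stand-in for the model: choose the frame whose caption
    shares the most words with the instruction."""
    words = {w.strip("?.,!") for w in instruction.lower().split()}

    def score(frame_id):
        return len(words & set(tour[frame_id]["caption"].split()))

    return max(tour, key=score)


def shortest_path(tour, start, goal):
    """Breadth-first search over the frame graph; returns a list of
    frame ids from start to goal, or None if goal is unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in tour[path[-1]]["neighbors"]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


goal = pick_goal_frame("Where can I draw a diagram on a whiteboard?", TOUR)
route = shortest_path(TOUR, start=3, goal=goal)
print(goal, route)
```

Word overlap cannot handle the vague requests described above; in a real system that step is where a multimodal model would do the work, while the route search can stay a conventional graph algorithm.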
Collaborative Technology
Alongside Gemini, DeepMind employs its Robotic Transformer 2 (RT-2) model, which combines vision, language, and action. This model learns from both web data and real-world robotics data, helping the AI understand and interact with its environment more effectively.
DeepMind’s innovations in AI and robotics highlight a significant step towards creating smarter, more capable machines that can assist humans in everyday tasks and complex environments.