
Google DeepMind Unveils Gemini Robotics-ER 1.5: A Leap in Robotic Reasoning

Gemini Robotics-ER 1.5 splits intelligence into high-level reasoning and low-level control, enabling robots to tackle complex, real-world tasks more effectively.



Google DeepMind has unveiled Gemini Robotics-ER 1.5, a significant advancement in embodied reasoning for robots. The new system shows marked improvements in instruction following and action generalization.

Gemini Robotics-ER 1.5 targets long-horizon, real-world tasks and introduces motion transfer. This allows skills learned on one platform to transfer to another, reducing data collection and narrowing the gap between simulation and reality.

The system splits embodied intelligence into two models: one for high-level reasoning and another for low-level visuomotor control. Gemini Robotics-ER 1.5, the high-level model, is a multimodal planner that ingests images and video, tracks progress, and invokes external tools before issuing sub-goals.
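The two-model split can be illustrated with a minimal sketch. This is not DeepMind's implementation: all class and function names here (`HighLevelPlanner`, `LowLevelController`, `run_task`) are hypothetical, and the fixed sub-goal decomposition stands in for what would really be a multimodal model conditioned on images and video.

```python
from dataclasses import dataclass

@dataclass
class SubGoal:
    description: str
    done: bool = False

class HighLevelPlanner:
    """Hypothetical stand-in for the high-level reasoning model:
    decomposes a task into ordered sub-goals for the controller."""
    def plan(self, task: str) -> list[SubGoal]:
        # A real planner would ingest images/video and invoke tools;
        # here we return a fixed decomposition for illustration.
        return [SubGoal(f"{task}: step {i}") for i in range(1, 4)]

class LowLevelController:
    """Hypothetical visuomotor policy: executes one sub-goal at a time."""
    def execute(self, goal: SubGoal) -> bool:
        goal.done = True  # pretend actuation succeeded
        return goal.done

def run_task(task: str) -> list[SubGoal]:
    """Orchestration loop: plan once, execute sub-goals, track progress."""
    planner, controller = HighLevelPlanner(), LowLevelController()
    goals = planner.plan(task)
    for goal in goals:
        if not controller.execute(goal):
            break  # a real system could replan mid-rollout here
    return goals

goals = run_task("sort the laundry")
print([g.done for g in goals])  # → [True, True, True]
```

The point of the split is that each half can be improved or swapped independently: the planner reasons over long horizons, while the controller only ever sees one sub-goal.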

DeepMind has introduced layered controls and expanded evaluation suites to catch hallucinated affordances or nonexistent objects before actuation. Earlier end-to-end vision-language-action models struggled with planning and generalization, which motivated the modular design of Gemini Robotics-ER 1.5.
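One such pre-actuation control could be a simple existence check: reject any sub-goal that references objects the perception system does not currently see. This is an illustrative sketch only; `safe_to_act` and its arguments are assumptions, not DeepMind's actual safety layer.

```python
def safe_to_act(subgoal_objects: list[str],
                visible_objects: list[str]) -> tuple[bool, list[str]]:
    """Guard against hallucinated objects: a plan may only proceed if
    every object it references is present in the perceived scene."""
    missing = sorted(set(subgoal_objects) - set(visible_objects))
    return len(missing) == 0, missing

# The planner wants to place a cup on a tray, but no tray is visible:
ok, missing = safe_to_act(["cup", "tray"], ["cup", "table"])
print(ok, missing)  # → False ['tray']
```

A real system would layer several such checks (reachability, force limits, human proximity) between plan emission and motor commands.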

Enabling thought traces in the VLA model increases long-horizon task completion rates and stabilizes mid-rollout plan revisions. Pairing Gemini Robotics-ER 1.5 with the VLA agent also yields more progress on multi-step tasks than a baseline orchestrator.

Gemini Robotics-ER 1.5, developed by Google DeepMind, is a significant step forward in robotic embodied reasoning. Its ability to generalize across tasks and platforms, combined with its robust planning capabilities, promises to advance the field of robotics.
