Google: How we built the new family of Gemini Robotics models

Powered by Gemini Robotics models, robots can learn complex actions like preparing salads, playing games like Tic-Tac-Toe and even folding an origami fox.

A GIF shows a black robot arm picking up a small orange ball and placing it into a miniature toy basketball hoop. The prompt "Pick up the basketball and slam dunk it" is written out at the bottom of the GIF.

Carolina says witnessing the slam dunk was a “wow” moment.

A humanoid robot stands opposite a person looking at their laptop. In white text, the words Gemini 2.0 + Robotics are overlaid.

A collage of four visualizations of embodied reasoning capabilities. Top left: 2D object detection; top right: pointing; bottom left: multi-view correspondence; bottom right: 3D object detection.

Gemini Robotics-ER excels at embodied reasoning capabilities, including detecting objects and pointing at object parts, finding corresponding points and detecting objects in 3D.

Four images of robots performing actions. Top left: a humanoid robot packs a lunch. Top right: a small arm picks up a snap pea from a Tupperware container. Bottom left: two large white arms sit on a bench, ready for a task. Bottom right: a black pincer hand holds a whiteboard eraser against a whiteboard.

The models adapt to different embodiments, performing tasks like packing a lunchbox or wiping a whiteboard across different robot forms.