Google DeepMind Launches Gemini Robotics 1.5

Google DeepMind has introduced Gemini Robotics 1.5, a groundbreaking family of AI models designed to bring advanced reasoning and action capabilities to physical robots, marking a significant step toward the “agentic era” of robotics. Announced on September 25, 2025, the suite includes the embodied reasoning model Gemini Robotics-ER 1.5 and the vision-language-action (VLA) model Gemini Robotics 1.5, both built on Gemini 2.0 with enhancements for physical environments. The models are designed to plan complex tasks, adapt to diverse robot forms such as bi-arm platforms and humanoids, and perform real-world activities such as cleaning a table or folding origami. Gemini Robotics-ER 1.5, now available to developers via the Gemini API in Google AI Studio, acts as a “high-level brain” for robots, orchestrating actions with natural language understanding and tool integration.

This release builds on Gemini 2.0’s multimodal foundation, adding physical action as an output modality to enable seamless interaction with the real world.

Key Models: Embodied Reasoning and Vision-Language-Action

The Gemini Robotics family comprises two complementary models, each addressing different aspects of robotic intelligence.

  • Gemini Robotics-ER 1.5 (Embodied Reasoning): This model serves as the robot’s “high-level brain,” processing natural language commands, reasoning through long-horizon tasks, and orchestrating behaviors. It excels at breaking down complex requests (e.g., “clean up the table”) into step-by-step plans, estimating success probabilities, and calling tools like Google Search or custom functions. With state-of-the-art spatial understanding, it adapts to novel situations and diverse environments.
  • Gemini Robotics 1.5 (Vision-Language-Action): Focused on execution, this VLA model generates robot actions using visual and text inputs. It “thinks” through each step, considering intuitive approaches to manipulation, and supports fine motor skills like grasping or coordination. Optimized for on-device operation, it runs locally on robots without internet dependency, accelerating adaptation to new tasks.

Both models generalize across robot embodiments—from static bi-arm setups like ALOHA to humanoids like Apptronik’s Apollo—reducing the need for hardware-specific training.

| Model | Core Function | Key Capabilities | Availability |
| --- | --- | --- | --- |
| Gemini Robotics-ER 1.5 | High-level planning | Natural language processing, tool calling, success estimation | Gemini API (Google AI Studio) |
| Gemini Robotics 1.5 | Action generation | Visual guidance, fine motor control, on-device optimization | Select partners; SDK for adaptation |
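
For developers, a minimal sketch of prompting the embodied-reasoning model through the Gemini API in Google AI Studio might look like the following. The model identifier and prompt format are assumptions for illustration; check Google’s documentation for the exact ID.

```python
# Minimal sketch: asking the embodied-reasoning model for a task plan via the
# Gemini API. The model ID below is an assumption, not a confirmed identifier.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed model ID
    contents=(
        "You control a bi-arm robot standing at a cluttered table. "
        "Break the command 'clean up the table' into an ordered list of steps "
        "and estimate the probability of success for each step."
    ),
)
print(response.text)
```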

Capabilities and Benchmarks: State-of-the-Art Performance

Gemini Robotics 1.5 outperforms previous models in generalization, handling unseen tasks, objects, and environments. It achieves top marks on academic benchmarks for multi-step reasoning and internal tests for real-world adaptability.

  • Multi-Step Tasks: Breaks down instructions like “sort laundry by color” into plans, using vision to guide actions.
  • Cross-Embodiment Learning: A single model works across robots, speeding up development.
  • Safety Focus: Includes safeguards via collaborations with experts and the Responsibility and Safety Council.

DeepMind researchers demonstrated it on tasks requiring dexterity, such as industrial assembly or origami folding, showcasing intuitive problem-solving.
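
The planner/executor split described in this section can be illustrated with a hypothetical orchestration loop: the embodied-reasoning model produces natural-language steps, and an on-robot VLA policy carries each one out. The RobotPolicy class and its act() method are invented stand-ins, not part of any published SDK, and the model ID is again an assumption.

```python
# Hypothetical orchestration loop: the embodied-reasoning model plans in
# natural language, and an on-robot VLA policy executes each step.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")


class RobotPolicy:
    """Stand-in for an on-device Gemini Robotics 1.5 VLA policy (hypothetical)."""

    def act(self, instruction: str) -> bool:
        # A real policy would consume camera frames and robot state and emit
        # motor commands; here we only log the instruction.
        print(f"executing: {instruction}")
        return True


def run_task(command: str, policy: RobotPolicy) -> None:
    # Ask the planner for one step per line (assumed prompt format).
    plan = client.models.generate_content(
        model="gemini-robotics-er-1.5-preview",  # assumed model ID
        contents=f"List the steps, one per line, to: {command}",
    )
    for step in plan.text.splitlines():
        if step.strip() and not policy.act(step.strip()):
            break  # a real agent would ask the planner to re-plan here


run_task("sort the laundry by color", RobotPolicy())
```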

Partnerships and Developer Access

Google is partnering with Apptronik to integrate Gemini Robotics into next-generation humanoids, enabling fine motor skills and coordination. Developers can access Gemini Robotics-ER 1.5 via the Gemini API today, with the VLA model available to select partners and an SDK for customization.

  • On-Device Model: Runs locally for real-time performance.
  • SDK Release: Tools for fine-tuning on custom tasks and environments.
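
The tool calling mentioned earlier for the embodied-reasoning model maps naturally onto the Gemini API’s function-calling support. A hedged sketch follows; the move_object skill is entirely hypothetical, and the model ID is an assumption.

```python
# Hedged sketch of tool calling: a Python callable is exposed to the model via
# the SDK's function-calling support. The robot skill below is hypothetical.
from google import genai
from google.genai import types


def move_object(item: str, destination: str) -> str:
    """Hypothetical robot skill: move a named item to a destination."""
    return f"moved {item} to {destination}"


client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed model ID
    contents="Put the mug in the dish rack.",
    # Passing a callable enables the SDK's automatic function calling.
    config=types.GenerateContentConfig(tools=[move_object]),
)
print(response.text)
```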

DeepMind’s Carolina Parada noted: “We’re powering physical agents to perceive, plan, think, and act in complex scenarios.”

Conclusion: Gemini Robotics 1.5 Ushers in the Agentic Era

Google DeepMind’s Gemini Robotics 1.5 launch is a leap for embodied AI, blending reasoning and action to make robots more versatile and intuitive. From planning to execution, it accelerates the shift to physical agents. For developers, the API beckons—will it transform robotics as Gemini did chat? The hardware hums with possibility.
