Google (GOOGL) has introduced Gemini Robotics On-Device, a new AI model designed to run locally on robotic devices without a network connection. The model builds on the previously released cloud-based Gemini Robotics model and is intended to make robots more controllable and responsive. Developers can use natural language prompts to fine-tune the model for a range of applications, making human-machine interaction more efficient.
According to Google's internal tests, Gemini Robotics On-Device performs close to its cloud counterpart and outperforms other on-device models on general benchmarks. Demonstrations show robots equipped with the model completing tasks such as unzipping a backpack and folding clothes.
Initially developed for ALOHA robots, the model has since been adapted to the bi-arm Franka FR3 robot and Apptronik's Apollo humanoid robot. The Franka FR3 in particular handled unfamiliar scenarios and objects, including assembly tasks on an industrial conveyor belt.
To support developers, Google DeepMind has also released the Gemini Robotics SDK. Developers can train robots on new tasks with 50 to 100 demonstrations in the MuJoCo physics simulator, which speeds up training and deployment.
Other technology companies, including NVIDIA (NVDA) and Hugging Face, are also bringing AI models to robotics. NVIDIA is building a platform for creating foundation models for humanoid robots, while Hugging Face is developing open-source models and datasets for robotics.