Team behind Silica crabs tackles networking maintenance machines
Microsoft Research has offered a glimpse of some of its prototype data center robotics efforts.
In a paper for Hotnets, the team said that “this marks the beginning of a fundamental shift in how we conceive and design data center hardware and the software services,” but noted that the effort was “at the very start of this journey.”
Key to the robotics project, Microsoft said, was not to try to recreate humans in robotic form. “We simply do not believe that the humanoid form factor or a hand-inspired gripper is suitable for most tasks in the data center,” the paper states.
Instead, the company believes that it makes sense to build multiple “advanced modular” robots, each designed for a highly specific task. Those tasks should ultimately allow the data center to clean and repair itself.
“We propose the concept of self-maintaining systems,” the paper states. “A self-maintaining system is one that can manage and control its own hardware repair and maintenance. This is enabled through advanced robotics and automation. It offers the potential for fine-grained control of repairs, not only reducing the time window for a repair, but also helping manage the impact of cascading failures and false positives on repairs.
“An additional advantage is that currently very little data center hardware is proactively serviced, it is usually accessed only when it fails. This is due to scale (and therefore costs) and the issue of cascading failures. We believe dextrous advanced robotics design specifically to operate in the data center can also make proactive maintenance feasible, and thereby reduce the number of hardware failures.”
Microsoft Research detailed two such maintenance robots, both of which are still prototypes.
The first is a transceiver manipulation robot which includes a manipulator arm and gripper that can grip and manipulate a single transceiver “while minimizing accidental interaction with physically close cables.”
The gripper can be inserted between optical cables to gently move them apart while still being able to grip the transceiver pull tab. The robot uses a vision system to understand the complex environment “and enable it to autonomously navigate through cluttered cabling to the target port to reseat, plug or unplug the transceiver.”
Humans, on the other hand, can cause transient packet loss when accidentally touching cables. “We refer to this phenomenon as simply cascading failures. Cascading failures occur when physical motion near or with hardware creates vibrations and other physical effects on the co-located hardware, which leads to additional transient (or permanent!) failures.”
Next is a fiber and transceiver cleaning robot. When a transceiver with an attached fiber cable is plugged into the unit by a technician or the transceiver manipulation robot, this robot “automatically detaches the cable from the transceiver, visually inspects the fiber end-face cores and the transceiver and then cleans any parts needed to pass inspection, before reassembling.”
It features many actuators, “and the device is complex and dextrous,” Microsoft said. The transceiver and cable diversity found in a large-scale global cloud provider is a challenge, the company admitted, so the unit uses cameras and recognition models to determine the type and size of the transceiver and cable.
A display mounted on the robot allows a human to monitor and observe progress, as well as see the inspected images. The cleaning robot is modular and can be integrated with the transceiver manipulation robot, or be used as a standalone system.
The entire operation, with both robots working in tandem, currently takes a few minutes, but that could be optimized, Microsoft said. “Already, the end-face inspection for eight cores takes less than 30 seconds which is less time than a well-trained human.”
The company is now “focusing on developing a set of small-scale robotic units that minimize the variety of robot form factors needed while supporting a diverse range of operations, and this set of robotic units includes mobility units” to operate at a larger scale, beyond the single rack.
Many of the researchers named in the paper, including Andromachi Chatzieleftheriou, Elliott Hogg, and Antony Rowstron, were also involved in the crab-like robots used for Project Silica, Microsoft’s 10,000-year storage system we profiled earlier this year. Those robots could move between racks as well as up and down.
“Recent advancements in robotics suggest that the enabling technology is nearly within reach,” the researchers say. “To realize this vision it will require a highly interdisciplinary approach, with researchers in robotics and automation collaborating closely with experts in networking, systems, and machine learning.”
They add: “A critical challenge in developing self-maintaining systems lies in advancing the current state of vision technology. Our work aims to address these obstacles by developing advanced perception systems capable of operating in environments characterized by complex wiring looms and significant occlusions.”
The team split data center robotics progress into four levels (along with Level 0, no robotics). Level 1 uses automated devices to augment human operators, while Level 2 features partial automation, where robots carry out specialized tasks with human supervision.
Level 3 is high automation: end-to-end tasks with limited human supervision. Level 4 is full automation, where every data center repair operation is fully autonomous, without the need for human supervision.
“We are currently exploring robotics that operate between Levels 2 and 3,” the researchers said.
“Level 3 will eventually enable self-maintaining systems, while Level 4 will enable potentially fully self-maintaining data centers, not designed around humans but optimized for high density and energy efficiency instead.
“At Level 4 humans can provide oversight at the data center site but without needing to be physically present in the data center halls. Even achieving the basic robotics to support Levels 1 and 2 is challenging and this is our current focus. However, we believe that once Level 2 is achieved, getting to Levels 3 and 4 will be easier.”
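For readers who think in code, the levels described above can be sketched as a simple Python enumeration. This is only an illustration: the names `AutomationLevel` and `human_role` are hypothetical and do not come from the paper, and the descriptions paraphrase the article's summary of each level.

```python
from enum import IntEnum


class AutomationLevel(IntEnum):
    """Hypothetical encoding of the data center robotics levels described in the article."""
    NONE = 0       # no robotics
    AUGMENTED = 1  # automated devices augment human operators
    PARTIAL = 2    # robots do specialized tasks under human supervision
    HIGH = 3       # end-to-end tasks with limited human supervision
    FULL = 4       # every repair operation is fully autonomous


def human_role(level: AutomationLevel) -> str:
    """Return a short description of what humans do at a given level (paraphrased)."""
    roles = {
        AutomationLevel.NONE: "humans perform all maintenance",
        AutomationLevel.AUGMENTED: "humans operate, assisted by automated devices",
        AutomationLevel.PARTIAL: "humans supervise robots doing specialized tasks",
        AutomationLevel.HIGH: "humans provide limited supervision of end-to-end tasks",
        AutomationLevel.FULL: "no human supervision needed for repairs",
    }
    return roles[level]


if __name__ == "__main__":
    # Print every level in order, from no robotics to full automation.
    for level in AutomationLevel:
        print(f"Level {level.value} ({level.name}): {human_role(level)}")
```

Using an ordered integer enumeration reflects the fact that the levels form a progression, so comparisons such as "between Levels 2 and 3" can be expressed directly.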
Last year, DCD exclusively reported on Microsoft’s dedicated data center robotics team, which eventually aims to create ‘zero touch’ data centers.
https://www.datacenterdynamics.com/en/news/microsoft-research-details-early-stage-modular-robotics-for-data-centers/