Self-maintaining [networked] systems: the rise of datacenter robotics!

  • Freddie Hong
  • Iason Sarantopoulos
  • Elliott Hogg
  • David Richardson
  • Yizhong Zhang
  • Hugh Williams
  • David Sweeney
  • Andromachi Chatzieleftheriou
  • Antony Rowstron
  • Microsoft Research Cambridge

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The vision of self-maintaining systems is for cloud hardware to be serviced and repaired automatically using robotics. We define a self-maintaining system as one in which software controls robots that automatically perform hardware maintenance tasks and repair operations. This shortens failure service windows and lowers the risk of repairs causing further cascading failures and outages. Self-maintaining systems are not purely reactive to failures; they also perform proactive maintenance before failures occur, which reduces future hardware failures. Operating an entire datacenter as a self-maintaining system is many years away, and we present four stages of automation, analogous to the levels used for autonomous vehicles, required to reach the full vision for datacenters. To experiment with and learn about self-maintaining systems, we have focused on datacenter networking. We have created basic robots that support common network maintenance tasks, such as reseating and cleaning optical transceivers and replacing optical fiber cables. The advantages of self-maintaining networks are lower costs and increased availability and reliability. The key is a cross-layer co-design approach: the core cloud services are co-designed with the robotic systems performing the repairs and maintenance. The services control the robots, which is analogous to how Software Defined Networking has evolved for broader network management.

Original language: English
Title of host publication: HOTNETS 2024 - Proceedings of the 2024 3rd ACM Workshop on Hot Topics in Networks
Place of Publication: New York, U.S.
Publisher: Association for Computing Machinery, Inc
Pages: 159-166
Number of pages: 8
ISBN (Electronic): 9798400712722
Publication status: Published - 18 Nov 2024
Externally published: Yes
Event: 3rd ACM Workshop on Hot Topics in Networks, HOTNETS 2024 - Irvine, United States
Duration: 18 Nov 2024 to 19 Nov 2024

Publication series

Name: HOTNETS 2024 - Proceedings of the 2024 3rd ACM Workshop on Hot Topics in Networks

Conference

Conference: 3rd ACM Workshop on Hot Topics in Networks, HOTNETS 2024
Country/Territory: United States
City: Irvine
Period: 18/11/24 to 19/11/24

Keywords

  • Automation
  • Networks
  • Self-repair
