Sergey Yekhanin

Local erasure coding for data storage


Historically, most large distributed storage systems (e.g.,
Hotmail) have used replication to provide reliability against
machine failures. Today, however, as the amount of stored data reaches
multiple exabytes, keeping even a few copies of data around is becoming
prohibitively expensive. Therefore, more and more systems are adopting
erasure coding in place of replication.

Local Reconstruction Codes (LRCs) are a new class of
erasure-correcting codes designed specifically for applications in data
storage. Built upon the rich mathematical theory of locally decodable
codes developed in the theory community, LRCs provide a high level of
reliability and allow data fragments to be reconstructed quickly in
typical failure scenarios. LRCs have recently been deployed by Windows
Azure Storage and are going to ship in Windows 8.1 and Windows Server

In this talk we will discuss the motivation behind local reconstruction
codes and cover the main technical challenges and tradeoffs in the
design of these codes.
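
As a rough illustration of the locality idea, here is a toy sketch in
Python. It is not the construction used in any deployed system: it
assumes plain XOR parities, equal-length fragments, and a single lost
fragment, and it omits the global parities a real code would need to
survive several failures. The names local_parities and reconstruct
are made up for this example.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def local_parities(fragments, group_size):
    """Split the fragments into groups and XOR each group into one parity."""
    groups = [fragments[i:i + group_size]
              for i in range(0, len(fragments), group_size)]
    return [reduce(xor_bytes, group) for group in groups]


def reconstruct(fragments, parities, group_size, lost):
    """Rebuild fragment `lost` using only its own group and that group's parity."""
    group = lost // group_size
    start = group * group_size
    end = min(start + group_size, len(fragments))
    survivors = [fragments[i] for i in range(start, end) if i != lost]
    # parity XOR all surviving group members leaves the missing fragment
    return reduce(xor_bytes, survivors, parities[group])


# Hypothetical data: six fragments in two local groups of three.
fragments = [b"aaaa", b"bbbb", b"cccc", b"dddd", b"eeee", b"ffff"]
parities = local_parities(fragments, group_size=3)
recovered = reconstruct(fragments, parities, group_size=3, lost=4)
```

In this toy setup, rebuilding one fragment reads three blocks (two
surviving group members and one local parity) rather than all six
data fragments. The price is one extra parity per group, which is the
kind of storage-versus-reconstruction-cost tradeoff the talk refers
to.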

(Based on joint papers with Brad Calder, Michael Forbes, Parikshit
Gopalan, Cheng Huang, Bob Jenkins, Jin Li, Aaron Ogus, Huseyin
Simitci, and Yikang Xu.)