
Distributed storage systems for large-scale applications typically use replication for reliability. Recently, erasure codes were used to reduce the large storage overhead, while increasing data reliability. A main limitation of off-the-shelf erasure codes is their high-repair cost during single node failure events. A major open problem in this area has been the design of codes that {\it i)} are repair efficient and {\it ii)} achieve arbitrarily high data rates. In this paper, we explore the repair metric of {\it locality}, which corresponds to the number of disk accesses required during a {\color{black}single} node repair. Under this metric we characterize an information theoretic trade-off that binds together locality, code distance, and the storage capacity of each node. We show the existence of optimal {\it locally repairable codes} (LRCs) that achieve this trade-off. The achievability proof uses a locality aware flow-graph gadget which leads to a randomized code construction. Finally, we present an optimal and explicit LRC that achieves arbitrarily high data-rates. Our locality optimal construction is based on simple combinations of Reed-Solomon blocks.
presented at ISIT 2012, accepted for publication in IEEE Trans. IT, 2014
Computer Science - Networking and Internet Architecture, Networking and Internet Architecture (cs.NI), FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Information Theory, Information Theory (cs.IT), Distributed, Parallel, and Cluster Computing (cs.DC)
Computer Science - Networking and Internet Architecture, Networking and Internet Architecture (cs.NI), FOS: Computer and information sciences, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Information Theory, Information Theory (cs.IT), Distributed, Parallel, and Cluster Computing (cs.DC)
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 376 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 1% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 0.1% | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 1% |
