Powered by OpenAIRE graph
Doctoral thesis
Data sources: DBLP

Coding techniques for distributed storage

Authors: Gastón Brasó, Bernat

Abstract

Online data storage is a growing business, yet unresolved issues in the field still keep it from reaching its full potential. Data replication (commonly known as backup) is not an efficient way to improve the persistence and accessibility of stored data. Error-correcting codes are known to add redundancy efficiently and so protect against loss of information. Unfortunately, their use entails additional problems, such as the repair problem: how to replace a storage node while downloading as little data as possible from the other nodes. In this dissertation, we survey the state of the art of codes applied to distributed storage systems. We also present a family of regenerating codes that we call quasi-cyclic flexible regenerating codes. Quasi-cyclic flexible minimum storage regenerating (QCFMSR) codes are constructed and their existence is proven. Quasi-cyclic flexible regenerating codes with minimum bandwidth, constructed from a base QCFMSR code, are also provided. Quasi-cyclic flexible regenerating codes are attractive because of their simplicity and low complexity. They allow exact repair-by-transfer in the minimum bandwidth case and an exact pseudo repair-by-transfer in the MSR case, where operations are needed only when a new node enters the system to replace a lost one. Finally, we propose a new model in which storage nodes are placed in two racks, and we generalize this two-rack model to any number of racks. In this set-up, storage nodes have different repair costs depending on the rack where they are placed. We also give a threshold function that minimizes the amount of data stored per node and the bandwidth needed to regenerate a failed node. This threshold function generalizes those given by previous distributed storage models. Tradeoff curves obtained from this threshold function are compared with those obtained from previous models, and the new model is shown to outperform previous ones in terms of repair cost.

Although online information storage is a growing business, it is not free of problems, one of which is the persistence and accessibility of the data. Data must be replicated so that the loss of one copy does not mean the information is lost for good. Unfortunately, data replication (known as "backup") is not an efficient solution, since it introduces a large amount of redundancy that leads to extra costs. Error-correcting codes are known to increase the persistence and accessibility of data while minimizing the redundancy required. Their use, however, introduces other problems, such as the so-called "repair problem": how to replace a storage node while downloading the minimum amount of data from the other nodes. In this dissertation, we study the state of the art of codes applied to distributed storage systems such as "cloud storage". We also approach the "repair problem" from a more applied angle, using the topologies of real systems such as "data centers". Specifically, we contribute a family of regenerating codes, which we call quasi-cyclic flexible regenerating codes, characterized by minimizing the use of computational resources in the process of regenerating a node. At the same time, this solution minimizes the stored data and the bandwidth needed to regenerate a failed node. We also study the case in which the costs of downloading data are not homogeneous. In particular, we focus on the case of racks, where the storage nodes are distributed across racks and the cost of downloading data from nodes in the same rack is much lower than the cost of downloading data from nodes in another rack. This new model generalizes previous theoretical models and lets us verify that costs can decrease if the theoretical model is adapted to the specific topology of the distributed storage system.
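
The repair problem named in the abstract can be made concrete with a small numerical sketch. The code below is not the rack-aware threshold function of the thesis, which this record does not reproduce. As an assumption, it uses the well-known tradeoff of the classical homogeneous regenerating-codes model (a file of size M stored on n nodes, any k of which suffice to recover it, with a failed node repaired by contacting d helper nodes), which is one of the "previous models" the abstract refers to. The function names `msr_point` and `mbr_point` are hypothetical and chosen for illustration only.

```python
from fractions import Fraction


def _check(n, k, d):
    # In the classical model a newcomer contacts d helpers, with k <= d <= n - 1.
    if not (1 <= k <= d <= n - 1):
        raise ValueError("parameters must satisfy 1 <= k <= d <= n - 1")


def msr_point(M, n, k, d):
    """Minimum-storage regenerating point of the classical model.

    Returns (alpha, gamma): data stored per node and total repair bandwidth.
    """
    _check(n, k, d)
    alpha = Fraction(M, k)                    # same per-node storage as an MDS code
    gamma = Fraction(M * d, k * (d - k + 1))  # total data downloaded for one repair
    return alpha, gamma


def mbr_point(M, n, k, d):
    """Minimum-bandwidth regenerating point of the classical model.

    At this point the node stores exactly what it downloads (alpha == gamma).
    """
    _check(n, k, d)
    gamma = Fraction(2 * M * d, k * (2 * d - k + 1))
    return gamma, gamma


if __name__ == "__main__":
    M, n, k, d = 12, 5, 3, 4  # illustrative values, not taken from the thesis
    print("naive MDS repair downloads", M)
    print("MSR (alpha, gamma):", msr_point(M, n, k, d))
    print("MBR (alpha, gamma):", mbr_point(M, n, k, d))
```

With these illustrative values, a naive repair that rebuilds the whole file downloads 12 units, the MSR point stores 4 units per node and downloads 8, and the MBR point stores and downloads 16/3 units. Exact fractions are used so that the two points can be compared without rounding.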

Country
Spain
Keywords

Technologies, Coding, Data storage, Regenerating codes, 68
