publication . Conference object . 2017

DottedDB: Anti-Entropy without Merkle Trees, Deletes without Tombstones

Gonçalves, Ricardo Jorge Tomé; Almeida, Paulo Sérgio; Baquero, Carlos; Fonte, Victor;
Open Access
  • Published: 01 Jan 2017
  • Publisher: IEEE
  • Country: Brazil
Abstract
To achieve high availability in the face of network partitions, many distributed databases adopt eventual consistency, allow temporary conflicts due to concurrent writes, and use some form of per-key logical clock to detect and resolve such conflicts. Furthermore, nodes synchronize periodically to ensure replica convergence in a process called anti-entropy, normally using Merkle Trees. We present the design of DottedDB, a Dynamo-like key-value store, which uses a novel node-wide logical clock framework, overcoming three fundamental limitations of the state of the art: (1) minimize the metadata per key necessary to track causality, avoiding its growth even in the...
Subjects
free text keywords: Anti-entropy, Causality, Distributed databases, Logical clocks, Merkle trees, Partial replication, Science & Technology, Distributed database, Computer network, business.industry, business, Logical clock, Theoretical computer science, High availability, Eventual consistency, Distributed computing, Merkle tree, Synchronization, Computer science, Replica, Metadata, Real-time computing
Funded by
FCT| UID/EEA/50014/2013
Project
UID/EEA/50014/2013
INESC TEC - INESC Technology and Science
  • Funder: Fundação para a Ciência e a Tecnologia, I.P. (FCT)
  • Project Code: 147326
  • Funding stream: 5876
38 references, page 1 of 3

[1] A. Verma, L. Pedrosa, M. Korupolu, D. Oppenheimer, E. Tune, and J. Wilkes, “Large-scale cluster management at google with borg,” in Proceedings of the Tenth European Conference on Computer Systems, ser. EuroSys '15. New York, NY, USA: ACM, 2015, pp. 18:1-18:17. [Online]. Available: http://doi.acm.org/10.1145/2741948.2741964

[2] E. A. Brewer, “Towards robust distributed systems,” in PODC, vol. 7, 2000.

[3] S. Gilbert and N. Lynch, “Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services,” ACM SIGACT News, vol. 33, no. 2, pp. 51-59, 2002.

[4] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, P. Vosshall, and W. Vogels, “Dynamo: amazon's highly available key-value store,” in ACM SIGOPS Operating Systems Review, vol. 41, no. 6, 2007, pp. 205-220. [OpenAIRE]

[5] W. Vogels, “Eventually consistent,” Communications of the ACM, vol. 52, no. 1, pp. 40-44, 2009.

[6] A. Lakshman and P. Malik, “Cassandra: a decentralized structured storage system,” ACM SIGOPS Operating Systems Review, vol. 44, no. 2, pp. 35-40, 2010.

[7] R. Klophaus, “Riak core: building distributed applications without shared state,” in ACM SIGPLAN Commercial Users of Functional Programming. ACM, 2010. [OpenAIRE]

[8] A. S. Foundation, “Cassandra,” http://cassandra.apache.org, accessed: 2016-10-20.

[9] Basho, “Riak,” http://basho.com/about/customers/, accessed: 2016-10- 20.

[10] P. Hunt, M. Konar, F. P. Junqueira, and B. Reed, “Zookeeper: Wait-free coordination for internet-scale systems,” in Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference, ser. USENIXATC'10. Berkeley, CA, USA: USENIX Association, 2010, pp. 11-11. [Online]. Available: http://dl.acm.org/citation.cfm?id= 1855840.1855851

[11] D. S. Parker Jr, G. Popek, G. Rudisin, A. Stoughton, B. Walker, E. Walton, J. Chow, and D. Edwards, “Detection of mutual inconsistency in distributed systems,” IEEE Transactions on Software Engineering, pp. 240-247, 1983. [OpenAIRE]

[12] P. S. Almeida, C. Baquero, R. Gonçalves, N. Preguiça, and V. Fonte, “Scalable and accurate causality tracking for eventually consistent stores,” in Distributed Applications and Interoperable Systems. Springer, 2014, pp. 67-81.

[13] W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen, “Don't settle for eventual: scalable causal consistency for wide-area storage with cops,” in Proceedings of the 23rd ACM Symposium on Operating Systems Principles, 2011, pp. 401-416.

[14] --, “Stronger semantics for low-latency geo-replicated storage,” in Presented as part of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), 2013, pp. 313-328.

[15] P. Bailis, A. Ghodsi, J. M. Hellerstein, and I. Stoica, “Bolt-on causal consistency,” in Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. ACM, 2013, pp. 761-772. [OpenAIRE]

38 references, page 1 of 3
Abstract
To achieve high availability in the face of network partitions, many distributed databases adopt eventual consistency, allow temporary conflicts due to concurrent writes, and use some form of per-key logical clock to detect and resolve such conflicts. Furthermore, nodes synchronize periodically to ensure replica convergence in a process called anti-entropy, normally using Merkle Trees. We present the design of DottedDB, a Dynamo-like key-value store, which uses a novel node-wide logical clock framework, overcoming three fundamental limitations of the state of the art: (1) minimize the metadata per key necessary to track causality, avoiding its growth even in the...
Subjects
free text keywords: Anti-entropy, Causality, Distributed databases, Logical clocks, Merkle trees, Partial replication, Science & Technology, Distributed database, Computer network, business.industry, business, Logical clock, Theoretical computer science, High availability, Eventual consistency, Distributed computing, Merkle tree, Synchronization, Computer science, Replica, Metadata, Real-time computing
Funded by
FCT| UID/EEA/50014/2013
Project
UID/EEA/50014/2013
INESC TEC - INESC Technology and Science
  • Funder: Fundação para a Ciência e a Tecnologia, I.P. (FCT)
  • Project Code: 147326
  • Funding stream: 5876
38 references, page 1 of 3

[1] A. Verma, L. Pedrosa, M. Korupolu, D. Oppenheimer, E. Tune, and J. Wilkes, “Large-scale cluster management at google with borg,” in Proceedings of the Tenth European Conference on Computer Systems, ser. EuroSys '15. New York, NY, USA: ACM, 2015, pp. 18:1-18:17. [Online]. Available: http://doi.acm.org/10.1145/2741948.2741964

[2] E. A. Brewer, “Towards robust distributed systems,” in PODC, vol. 7, 2000.

[3] S. Gilbert and N. Lynch, “Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services,” ACM SIGACT News, vol. 33, no. 2, pp. 51-59, 2002.

[4] G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati, A. Lakshman, A. Pilchin, P. Vosshall, and W. Vogels, “Dynamo: amazon's highly available key-value store,” in ACM SIGOPS Operating Systems Review, vol. 41, no. 6, 2007, pp. 205-220. [OpenAIRE]

[5] W. Vogels, “Eventually consistent,” Communications of the ACM, vol. 52, no. 1, pp. 40-44, 2009.

[6] A. Lakshman and P. Malik, “Cassandra: a decentralized structured storage system,” ACM SIGOPS Operating Systems Review, vol. 44, no. 2, pp. 35-40, 2010.

[7] R. Klophaus, “Riak core: building distributed applications without shared state,” in ACM SIGPLAN Commercial Users of Functional Programming. ACM, 2010. [OpenAIRE]

[8] A. S. Foundation, “Cassandra,” http://cassandra.apache.org, accessed: 2016-10-20.

[9] Basho, “Riak,” http://basho.com/about/customers/, accessed: 2016-10- 20.

[10] P. Hunt, M. Konar, F. P. Junqueira, and B. Reed, “Zookeeper: Wait-free coordination for internet-scale systems,” in Proceedings of the 2010 USENIX Conference on USENIX Annual Technical Conference, ser. USENIXATC'10. Berkeley, CA, USA: USENIX Association, 2010, pp. 11-11. [Online]. Available: http://dl.acm.org/citation.cfm?id= 1855840.1855851

[11] D. S. Parker Jr, G. Popek, G. Rudisin, A. Stoughton, B. Walker, E. Walton, J. Chow, and D. Edwards, “Detection of mutual inconsistency in distributed systems,” IEEE Transactions on Software Engineering, pp. 240-247, 1983. [OpenAIRE]

[12] P. S. Almeida, C. Baquero, R. Gonçalves, N. Preguiça, and V. Fonte, “Scalable and accurate causality tracking for eventually consistent stores,” in Distributed Applications and Interoperable Systems. Springer, 2014, pp. 67-81.

[13] W. Lloyd, M. J. Freedman, M. Kaminsky, and D. G. Andersen, “Don't settle for eventual: scalable causal consistency for wide-area storage with cops,” in Proceedings of the 23rd ACM Symposium on Operating Systems Principles, 2011, pp. 401-416.

[14] --, “Stronger semantics for low-latency geo-replicated storage,” in Presented as part of the 10th USENIX Symposium on Networked Systems Design and Implementation (NSDI 13), 2013, pp. 313-328.

[15] P. Bailis, A. Ghodsi, J. M. Hellerstein, and I. Stoica, “Bolt-on causal consistency,” in Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data. ACM, 2013, pp. 761-772. [OpenAIRE]

38 references, page 1 of 3
Powered by OpenAIRE Open Research Graph
Any information missing or wrong?Report an Issue
publication . Conference object . 2017

DottedDB: Anti-Entropy without Merkle Trees, Deletes without Tombstones

Gonçalves, Ricardo Jorge Tomé; Almeida, Paulo Sérgio; Baquero, Carlos; Fonte, Victor;