A Greedy Approach for Caching in Distributed Data Stores

Publication: Article, Conference object, Other literature type

Date: 01 Nov 2017

Publisher: IEEE

Journal: 2017 IEEE International Conference on Smart Cloud (SmartCloud)

Authors: Longbin Chen; Wenyun Dai; Meikang Qiu

doi: 10.1109/smartcloud.2017.46

Abstract

High read and write latency has made disks a less popular place to keep data. To serve millions or billions of requests, most of them reads, distributed data stores have to rely on memory systems. The cache system in a data store differs from a CPU cache. In a CPU cache, most workloads have a short life span, but in distributed storage, workloads can last for days, months, and even years. How to classify storage workloads is critical to efficient in-memory data stores. Even though memory technologies have improved dramatically, their capacity is still not enough to hold all data. Cache eviction algorithms like LRU (least recently used) and LFU (least frequently used) are widely adopted by in-memory systems. However, the uncertainty of storage workloads makes these algorithms less efficient and accurate, because their reliance on historical statistics might ignore or distort important patterns. To address these issues, we propose a greedy approach for caching in distributed data stores. Our approach combines the advantages of both LRU and LFU in the cache system and requires only a temporary data structure to determine which data to evict. We compare our method with the LRU approach. The evaluation shows that our greedy approach reduces latency by 50% and doubles throughput for reads. It also improves the write performance of the data store by a small fraction.
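The abstract does not spell out the eviction rule, so the following is only a minimal sketch of one way recency (LRU) and frequency (LFU) information could be combined greedily. It is not the authors' algorithm. The class name `HybridCache`, the `window` parameter, and the tie-breaking rule are all assumptions made for illustration: on eviction, the sketch builds a temporary list of the `window` least recently used keys and drops the one with the fewest accesses.

```python
from collections import OrderedDict
from itertools import islice


class HybridCache:
    """Illustrative recency + frequency cache (hypothetical, not the paper's method)."""

    def __init__(self, capacity, window=2):
        self.capacity = capacity      # maximum number of entries
        self.window = window          # assumed: how many LRU entries to consider
        self.data = OrderedDict()     # key -> value, least recently used first
        self.hits = {}                # key -> access count

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)    # mark as most recently used
        self.hits[key] += 1
        return self.data[key]

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            self._evict()
        self.data[key] = value
        self.data.move_to_end(key)
        self.hits[key] = self.hits.get(key, 0) + 1

    def _evict(self):
        # Temporary candidate list: the `window` least recently used keys.
        candidates = list(islice(self.data, self.window))
        # Greedy choice: drop the candidate with the fewest accesses.
        # min() returns the first minimum, so ties go to the older key.
        victim = min(candidates, key=lambda k: self.hits[k])
        del self.data[victim]
        del self.hits[victim]
        return victim
```

With `window=1` this sketch degenerates to plain LRU, and with `window` equal to the capacity it behaves like LFU with recency as the tie-breaker, which is one reason a small window is a convenient knob for illustration.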

Related Organizations

Pace University
United States

Impact by BIP!

- selected citations: 3. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
