
HBase is a popular distributed key/value storage system based on the design of BigTable. It is used in many data centers, such as those of Facebook and Twitter, for its portability and scalability. Industrial deployments expect it to deliver low latency and large storage capacity. However, retrieving one column by the value of another is time-consuming. Several techniques have been proposed to address this problem. One approach is to add a secondary index to HBase, such as h index, which achieves high retrieval performance. Unfortunately, when a column holds only a limited set of distinct values, a secondary index accelerates retrieval but cannot reduce storage consumption. In this paper, we present a novel design for HBase that reduces storage consumption while also accelerating retrieval in this situation. We design an enumeration system for HBase and provide an interface for creating an enumeration for a specific column in a table. Our performance evaluation shows that it achieves a 2.27x improvement in retrieval and a 12x reduction in storage compared with HBase without enumeration.
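To illustrate why enumerating a column with few distinct values can save space and speed up lookups, the following is a minimal sketch of dictionary encoding in plain Python. It is not the paper's implementation and does not use the HBase API; the class `EnumColumn`, its methods, and the one-byte code width are all assumptions made for illustration.

```python
from typing import Dict, List


class EnumColumn:
    """Toy dictionary encoding for a column with few distinct values.

    Hypothetical illustration only: each distinct value gets a one-byte
    code, and rows store the code instead of the full value.
    """

    def __init__(self) -> None:
        self.code_of: Dict[str, int] = {}  # value -> small integer code
        self.value_of: List[str] = []      # code -> value
        self.codes = bytearray()           # one byte per row

    def append(self, value: str) -> None:
        code = self.code_of.get(value)
        if code is None:
            # Assumed limit: one byte per code allows 256 distinct values.
            if len(self.value_of) >= 256:
                raise ValueError("more than 256 distinct values")
            code = len(self.value_of)
            self.code_of[value] = code
            self.value_of.append(value)
        self.codes.append(code)

    def get(self, row: int) -> str:
        """Decode the value stored at a row index."""
        return self.value_of[self.codes[row]]

    def rows_equal_to(self, value: str) -> List[int]:
        """Scan the compact codes rather than comparing full strings."""
        code = self.code_of.get(value)
        if code is None:
            return []
        return [i for i, c in enumerate(self.codes) if c == code]

    def encoded_size(self) -> int:
        """Bytes for the per-row codes plus the value dictionary."""
        dictionary = sum(len(v.encode("utf-8")) for v in self.value_of)
        return len(self.codes) + dictionary


def raw_size(values: List[str]) -> int:
    """Bytes needed to store every value in full."""
    return sum(len(v.encode("utf-8")) for v in values)


if __name__ == "__main__":
    values = ["California", "Texas", "California", "New York"] * 1000
    col = EnumColumn()
    for v in values:
        col.append(v)
    print(raw_size(values), col.encoded_size())
    print(col.rows_equal_to("Texas")[:3])
```

The saving grows with the length of the values and shrinks as the number of distinct values approaches the number of rows, which is why the approach targets columns with a limited set of values.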
