
handle: 20.500.14243/63149, 11573/1572863, 11568/208348
Encoding lists of integers efficiently is a key task in many applications across different fields. Adjacency lists of large graphs are usually encoded to save space and to improve decoding speed. Inverted indexes of Information Retrieval (IR) systems usually keep their lists of postings compressed to allow optimal use of the memory hierarchy. Secondary indexes in DBMSs are stored similarly to inverted indexes in IR systems. In this paper we propose a novel class of encoders (called VSEncoding, from Vector of Splits Encoding) that, roughly speaking, work by partitioning a list of integers into blocks, each of which is compressed efficiently with a simple encoder. Unlike previous work, where heuristics were applied during the partitioning step, we carry out this important step via dynamic programming, which yields the optimal partition. Experiments show that our class of encoders outperforms all existing methods in the literature by more than 10% (with the exception of Binary Interpolative Coding, with which it roughly ties) while still retaining very fast decompression.
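The partitioning step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the cost model (a fixed per-block header plus a fixed-width binary code sized for the largest value in the block), the function names `bits_needed` and `optimal_partition`, and the parameters `header_bits` and `max_block` are all assumptions made for illustration. Under that assumed cost model, dynamic programming finds the partition with the minimum total number of bits.

```python
def bits_needed(x: int) -> int:
    """Bits required to store a non-negative integer in binary (at least 1)."""
    return max(1, x.bit_length())


def optimal_partition(values, header_bits=8, max_block=64):
    """Split `values` into consecutive blocks with minimum total cost.

    Assumed cost model (hypothetical, for illustration only): each block
    pays `header_bits` once, plus `len(block) * width`, where `width` is
    the bit length of the largest value in the block.
    Returns (total_bits, [(start, end), ...]) with half-open ranges.
    """
    n = len(values)
    cost = [0] + [float("inf")] * n   # cost[i]: best cost for values[:i]
    prev = [0] * (n + 1)              # prev[i]: start of the last block

    for end in range(1, n + 1):
        width = 0
        # Grow the candidate last block backwards from `end`.
        for start in range(end - 1, max(0, end - max_block) - 1, -1):
            width = max(width, bits_needed(values[start]))
            candidate = cost[start] + header_bits + (end - start) * width
            if candidate < cost[end]:
                cost[end] = candidate
                prev[end] = start

    # Walk the back-pointers to recover the block boundaries.
    blocks = []
    end = n
    while end > 0:
        blocks.append((prev[end], end))
        end = prev[end]
    blocks.reverse()
    return cost[n], blocks


gaps = [1, 1, 1, 1, 1000, 1, 1, 1, 1]
total_bits, blocks = optimal_partition(gaps)
print(total_bits, blocks)
```

In this example a single block would need 8 + 9 × 10 = 98 bits, because the one large value forces a 10-bit width on every entry. Isolating that value in its own block lets the neighbouring runs use 1 bit per entry, which is the kind of saving an optimal partition captures and a fixed block size can miss.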
Data Storage Representations, d-gap encoding, Inverted lists, Adaptive encoding, Information Storage, Coding and Information Theory. Data compaction and compression, Index compression, Inverted index encoding, Data Compression, Systems and Software
| selected citations | 48 |
| popularity | Top 10% |
| influence | Top 10% |
| impulse | Top 10% |
