VSEncoding

Publication: Article, Report, Conference object
Date: 26 Oct 2010
Country: Italy
Publisher: ACM
Journal: Proceedings of the 19th ACM international conference on Information and knowledge management

Authors: Fabrizio Silvestri; Rossano Venturini

doi: 10.1145/1871437.1871592

handle: 20.500.14243/63149, 11573/1572863, 11568/208348

Abstract

Encoding lists of integers efficiently is a key task in many applications across different fields. Adjacency lists of large graphs are usually encoded to save space and to improve decoding speed. Inverted indexes of Information Retrieval systems usually keep their lists of postings compressed to make optimal use of the memory hierarchy. Secondary indexes of DBMSs are stored similarly to inverted indexes in IR systems. In this paper we propose a novel class of encoders (called VSEncoding, from Vector of Splits Encoding) that, roughly speaking, work by partitioning a list of integers into blocks, which are then efficiently compressed using simple encoders. Unlike previous work, where heuristics were applied during the partitioning step, we carry out this important step via dynamic programming, which yields the optimal solution. Experiments show that our class of encoders outperforms all existing methods in the literature by more than 10% (with the exception of Binary Interpolative Coding, with which they roughly tie) while still retaining very fast decompression.
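
The partitioning step described in the abstract can be illustrated with a small dynamic program. The following is a minimal sketch under an assumed cost model, not the encoder from the paper: each block is charged a fixed header (the hypothetical parameter `header_bits`) plus, for every element, the number of bits needed for the largest value in the block. The names `bits_needed`, `optimal_partition`, and `max_block` are illustrative assumptions as well.

```python
def bits_needed(x: int) -> int:
    """Bits required to write x in binary (at least 1, so that 0 is representable)."""
    return max(1, x.bit_length())


def optimal_partition(values, header_bits=8, max_block=16):
    """Return (total_cost_in_bits, blocks) minimizing the assumed cost model.

    Assumed cost of a block: header_bits + len(block) * bits_needed(max(block)).
    This is an illustrative model, not the one used by VSEncoding itself.
    """
    n = len(values)
    best = [0] + [float("inf")] * n   # best[i]: minimum cost of encoding values[:i]
    prev = [0] * (n + 1)              # prev[i]: start index of the last block in that solution
    for i in range(1, n + 1):
        width = 0
        # Grow the last block backwards so that it covers values[j:i].
        for j in range(i - 1, max(0, i - max_block) - 1, -1):
            width = max(width, bits_needed(values[j]))
            cost = best[j] + header_bits + width * (i - j)
            if cost < best[i]:
                best[i], prev[i] = cost, j
    # Walk back through prev to recover the chosen blocks.
    blocks, i = [], n
    while i > 0:
        blocks.append(values[prev[i]:i])
        i = prev[i]
    return best[n], blocks[::-1]
```

Under this assumed model, `optimal_partition([1, 2, 1, 3, 900, 1000, 2, 1])` isolates the two large values in their own block, giving `[[1, 2, 1, 3], [900, 1000], [2, 1]]` at a cost of 56 bits, compared with 88 bits for a single block. The running time is O(n * max_block), since each prefix considers at most `max_block` candidate last blocks.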

Country

Italy

Related Organizations

National Research Council, Italy
National Research Council, Sri Lanka
Institute of Information Science and Technologies "A. Faedo", Italy
University of Pisa, Italy
Sapienza University of Rome, Italy

Keywords

Data Storage Representations, D-gap encoding, Inverted lists, Adaptive encoding, Information Storage, Coding and Information Theory, Data compaction and compression, Index compression, Inverted index encoding, Data Compression, Systems and Software

Impact by BIP!

selected citations: 48. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Open Access route: Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
