LZ-Compressed String Dictionaries

Preprint English OPEN
Arz, Julian; Fischer, Johannes;
  • Subject: Computer Science - Data Structures and Algorithms

We show how to compress string dictionaries using the Lempel-Ziv (LZ78) data compression algorithm. Our approach is validated experimentally on dictionaries of up to 1.5 GB of uncompressed text. We achieve compression ratios often outperforming the existing alternatives... View more
  • References (11)
    11 references, page 1 of 2

    1. P. Boldi, B. Codenotti, M. Santini, and S. Vigna. Ubicrawler: A scalable fully distributed web crawler. Software: Practice & Experience, 34(8):711{726, 2004.

    2. N. R. Brisaboa, R. Canovas, F. Claude, M. A. Mart nez-Prieto, and G. Navarro. Compressed string dictionaries. In Proc. SEA, volume 6630 of LNCS, pages 136{ 147. Springer, 2011.

    3. P. Ferragina, R. Grossi, A. Gupta, R. Shah, and J. S. Vitter. On searching compressed string collections cache-obliviously. In Proc. PODS, pages 181{190. ACM, 2008.

    4. P. Ferragina and G. Navarro. The Pizza&Chili Corpus. http://pizzachili.dcc. uchile.cl/texts/dna/.

    5. P. Ferragina and R. Venturini. The compressed permuterm index. ACM Transactions on Algorithms, 7(1):Article No. 10, 2010.

    6. S. Gog. Succinct Data Structures Library. http://simongog.github.io/sdsl/.

    7. R. Grossi and G. Ottaviano. Fast compressed tries through path decompositions. In Proc. ALENEX, pages 65{74. SIAM Press, 2012.

    8. J.-B. Michel et al. Quantitative analysis of culture using millions of digitized books. Science, 331(6014):176{182, 2011.

    9. J. I. Munro and V. Raman. Succinct representation of balanced parentheses and static trees. SIAM J. Comput., 31(3):762{776, 2001.

    10. D. Okanohara. Tx: Succinct trie data structure. http://code.google.com/p/ tx-trie/.

  • Similar Research Results (2)
  • Metrics
Share - Bookmark