Powered by OpenAIRE graph
Aperta - TÜBİTAK Açık Arşivi
Other literature type . 2011
License: CC BY
https://doi.org/10.1109/dcc.20...
Article . 2011 . Peer-reviewed
Data sources: Crossref

Compressed Context Modeling for Text Compression

Authors: Kulekci, M. Oguzhan

Abstract

In text compression, statistical context modeling aims to construct a model that computes the probability distribution of a character based on its context. The order-$k$ context of a symbol is defined as the string formed by its preceding $k$ symbols. This study introduces compressed context modeling, which defines the order-$k$ context of a character as the sequence of $k$ bits composed of the entropy-compressed representations of its preceding characters. When computing the compressed context of a symbol at some position in a given text, enough preceding characters are included to produce $k$ bits of information. Thus, instead of a fixed number of characters, a fixed amount of information is considered as the context of a character, and this property enables each character to be predicted from a nearly uniform amount of information. Experiments compare the proposed modeling against the classical fixed-length context definitions: the files in the large Calgary corpus are modeled with both compressed context modeling and classical fixed-length context modeling. On average, the statistical model built with the proposed method uses 13.76 percent less space, measured by the number of distinct contexts, while providing a 5.88 percent gain in empirical entropy, measured as information content in bits per character.
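The two context definitions in the abstract can be compared on a small scale with a rough sketch like the one below. It is not the paper's implementation: the choice of a static Huffman code as the "entropy compressed representation", the rule of taking the last $k$ bits of the encoded prefix as the compressed context, and all function names (`huffman_codes`, `compressed_contexts`, `classical_contexts`, `model_stats`) are assumptions made for illustration only.

```python
import heapq
import math
from collections import Counter, defaultdict


def huffman_codes(text):
    """Build a prefix code (symbol -> bit string) from symbol frequencies.

    Assumption: a static Huffman code stands in for the entropy coder.
    """
    freq = Counter(text)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def compressed_contexts(text, k, codes):
    """Yield (context, symbol) pairs; context = last k bits of the encoded prefix.

    Assumption: positions with fewer than k preceding bits are skipped.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    bits = ""
    for ch in text:
        if len(bits) >= k:
            yield bits, ch
        bits = (bits + codes[ch])[-k:]  # keep only the most recent k bits


def classical_contexts(text, k):
    """Yield (context, symbol) pairs; context = the k preceding characters."""
    for i in range(k, len(text)):
        yield text[i - k:i], text[i]


def model_stats(pairs):
    """Return (number of distinct contexts, empirical entropy in bits/char)."""
    table = defaultdict(Counter)
    for ctx, sym in pairs:
        table[ctx][sym] += 1
    total = sum(sum(c.values()) for c in table.values())
    if total == 0:
        return 0, 0.0
    cost = 0.0
    for counts in table.values():
        n = sum(counts.values())
        for c in counts.values():
            cost -= c * math.log2(c / n)
    return len(table), cost / total


if __name__ == "__main__":
    sample = "abracadabra abracadabra abracadabra"
    codes = huffman_codes(sample)
    print("classical order-2 :", model_stats(classical_contexts(sample, 2)))
    print("compressed 8 bits :", model_stats(compressed_contexts(sample, 8, codes)))
```

The two numbers returned by `model_stats` correspond to the two measures named in the abstract: model size as the count of distinct contexts, and empirical entropy in bits per character. A toy string like the one above says nothing about the reported percentages; it only shows how the two context definitions differ mechanically.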

  • BIP! impact indicators
    citations: 0
    popularity: Average
    influence: Average
    impulse: Average