
doi: 10.1109/dcc.2006.15
Summary form only given. Multilingual text compression depends primarily on the ability to match the corresponding parts of related texts by identifying semantic correspondences across the various sub-texts, a task generally referred to as text alignment. Savings in storage space can be obtained by replacing words and phrases with pointers to their translations, determined by any alignment algorithm. The suggested method was tested on an English-French corpus of the European Union. The French part was compressed using pointers towards the English part. The obtained compression rate (22.0%) is similar to the performances of Bzip and HuffWord and better than that of Gzip. However, Bzip and Gzip's performances degrade when small sub-sections are processed separately, which makes them inappropriate for systems which often decode only small pieces.
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 1 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
