
Abstract: We discuss a general notion of a similarity function between two sequences that is based on their common subsequences. This notion arises in some applications of molecular biology [A.G. D'yachkov, P.L. Erdos, A.J. Macula, V.V. Rykov, D.C. Torney, C.-S. Tung, P.A. Vilenkin, and P.S. White, Exordium for DNA codes, Journal of Combinatorial Optimization 7 (4) (2003)]. We introduce the concept of similarity codes and study the logarithmic asymptotics for the size of optimal codes. Our mathematical results, announced in [A.G. D'yachkov, D.C. Torney, P.A. Vilenkin, and P.S. White, On a class of codes for the insertion-deletion metric, Proc. of ISIT-2002, Lausanne, Switzerland, July 2002], correspond to the longest common subsequence (LCS) similarity function [V.I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, Soviet Phys. Doklady, 10, 707-710, 1966], which leads to a special subclass of these codes called reverse-complement (RC) similarity codes. RC codes for additive similarity functions have been studied in previous papers [A.G. D'yachkov and D.C. Torney, On similarity codes, IEEE Trans. Inform. Theory 46 (4) (2000) 1558-1564], [A.G. D'yachkov, D.C. Torney, P.A. Vilenkin, and P.S. White, Reverse-complement similarity codes for DNA sequences, Proc. of ISIT-2000, Sorrento, Italy, July 2000], [P.A. Vilenkin, Some asymptotic problems of combinatorial coding theory and information theory (in Russian), Ph.D. dissertation, Moscow State University, 2000], [V.V. Rykov, A.J. Macula, C.M. Korzelius, D.C. Engelhart, D.C. Torney, and P.S. White, DNA sequences constructed on the basis of quaternary cyclic codes, Proc. of 4th World Multiconference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA, July 2000].
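As an illustration of the LCS similarity function named in the abstract, the following minimal Python sketch computes the length of a longest common subsequence of two sequences with the standard dynamic-programming recurrence. The function names `lcs_length` and `reverse_complement` are hypothetical and not taken from the paper. The reverse-complement helper assumes the usual Watson-Crick pairing (A with T, C with G) over the alphabet {A, C, G, T}. The abstract does not spell out how RC similarity codes are defined, so this sketch only shows the two building blocks.

```python
def lcs_length(x: str, y: str) -> int:
    """Length of a longest common subsequence of x and y (O(len(x) * len(y)) time)."""
    # prev[j] holds the LCS length of the processed prefix of x and y[:j].
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                # Matching symbols extend the LCS of the two shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop one symbol from either x or y.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Assumption: Watson-Crick complement over the DNA alphabet {A, C, G, T}.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(x: str) -> str:
    """Complement every symbol of x, then reverse the result."""
    return x.translate(_COMPLEMENT)[::-1]
```

For example, `lcs_length("ACGT", "AGT")` evaluates to 3 (the common subsequence `AGT`), and `reverse_complement("AACG")` evaluates to `"CGTT"`.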
