
arXiv: 1805.06177
The Average Common Substring (ACS) is a popular alignment-free distance measure for phylogeny reconstruction. The ACS of a sequence X[1, x] w.r.t. another sequence Y[1, y] is $\mathrm{ACS}(X, Y) = \frac{1}{x} \sum_{i=1}^{x} \max_{j} \mathrm{lcp}(X[i, x], Y[j, y])$, where lcp(·, ·) of two input sequences is the length of their longest common prefix. The ACS can be computed in O(n) space and time, where n = x + y is the input size. Compressed string matching is the study of string matching problems with the following twist: the input data is in a compressed format and the underlying task must be performed with little or no decompression. In this paper, we revisit the ACS problem under this paradigm, where the input sequences are given in their run-length encoded format. We present an algorithm to compute ACS(X, Y) in O(N log N) time using O(N) space, where N is the total length of the sequences after run-length encoding.
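To make the definition concrete, the following is a minimal Python sketch that evaluates ACS(X, Y) directly from the formula and shows what a run-length encoding looks like. It is a naive brute-force reference (roughly cubic time in the worst case), not the O(n) suffix-tree method or the O(N log N) run-length algorithm of the paper. The function names `lcp`, `acs_naive`, and `run_length_encode` are illustrative assumptions and do not come from the paper.

```python
from itertools import groupby


def lcp(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def acs_naive(x: str, y: str) -> float:
    """Brute-force ACS(X, Y): for every suffix of x, take the longest
    prefix it shares with any suffix of y, then average over len(x).
    Illustrative only; this is not the algorithm from the paper."""
    if not x:
        raise ValueError("x must be non-empty")
    total = 0
    for i in range(len(x)):
        # max over all starting positions j in y (0 if y is empty)
        total += max((lcp(x[i:], y[j:]) for j in range(len(y))), default=0)
    return total / len(x)


def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Run-length encoding as a list of (character, run length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


if __name__ == "__main__":
    print(acs_naive("abab", "ab"))      # per-suffix maxima are 2, 1, 2, 1
    print(run_length_encode("aaabbc"))  # three runs
```

In this sketch, the number of pairs returned by `run_length_encode` for both inputs together corresponds to N in the abstract, while `len(x) + len(y)` corresponds to n. Note that ACS is not symmetric: `acs_naive(x, y)` and `acs_naive(y, x)` generally differ.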
Suffix Trees, FOS: Computer and information sciences, Compression, String Algorithms, Algorithms on strings, RL Encoding, Computer Science - Data Structures and Algorithms, Analysis of algorithms, Data Structures and Algorithms (cs.DS), Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) (aspects in computer science)
