
doi: 10.46298/dmtcs.3553
We consider words with letters from a $q-ary$ alphabet $\mathcal{A}$. The kth subword complexity of a word $w ∈\mathcal{A}^*$ is the number of distinct subwords of length $k$ that appear as contiguous subwords of $w$. We analyze subword complexity from both combinatorial and probabilistic viewpoints. Our first main result is a precise analysis of the expected $kth$ subword complexity of a randomly-chosen word $w ∈\mathcal{A}^n$. Our other main result describes, for $w ∈\mathcal{A}^*$, the degree to which one understands the set of all subwords of $w$, provided that one knows only the set of all subwords of some particular length $k$. Our methods rely upon a precise characterization of overlaps between words of length $k$. We use three kinds of correlation polynomials of words of length $k$: unweighted correlation polynomials; correlation polynomials associated to a Bernoulli source; and generalized multivariate correlation polynomials. We survey previously-known results about such polynomials, and we also present some new results concerning correlation polynomials.
de \\bruijn graph, depth, subword complexity, suffix trees., autocorrelation, [info.info-dm] computer science [cs]/discrete mathematics [cs.dm], [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS], [MATH.MATH-CO] Mathematics [math]/Combinatorics [math.CO], [INFO.INFO-DM] Computer Science [cs]/Discrete Mathematics [cs.DM], analytic methods, asymptotics, [INFO.INFO-CG] Computer Science [cs]/Computational Geometry [cs.CG], correlation polynomial, [info.info-cg] computer science [cs]/computational geometry [cs.cg], de \\Bruijn graph, QA1-939, Analytic methods, [info.info-ds] computer science [cs]/data structures and algorithms [cs.ds], combinatorics on words, Mathematics, [math.math-co] mathematics [math]/combinatorics [math.co], average-case analysis
de \\bruijn graph, depth, subword complexity, suffix trees., autocorrelation, [info.info-dm] computer science [cs]/discrete mathematics [cs.dm], [INFO.INFO-DS] Computer Science [cs]/Data Structures and Algorithms [cs.DS], [MATH.MATH-CO] Mathematics [math]/Combinatorics [math.CO], [INFO.INFO-DM] Computer Science [cs]/Discrete Mathematics [cs.DM], analytic methods, asymptotics, [INFO.INFO-CG] Computer Science [cs]/Computational Geometry [cs.CG], correlation polynomial, [info.info-cg] computer science [cs]/computational geometry [cs.cg], de \\Bruijn graph, QA1-939, Analytic methods, [info.info-ds] computer science [cs]/data structures and algorithms [cs.ds], combinatorics on words, Mathematics, [math.math-co] mathematics [math]/combinatorics [math.co], average-case analysis
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 3 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
