
An elastic-degenerate (ED) string T is a sequence T = T[1] · · · T[n] of n finite sets of strings. The cardinality m of T is the total number of strings in T[i], for all i ∈ [1 . . n]. The size N of T is the total length of all m strings of T. ED strings have been introduced to represent a set of closely-related DNA sequences. Let P = P[1 . . p] be a pattern of length p and k > 0 be an integer. We consider the problem of k-Approximate ED String Matching (EDSM): searching k-approximate occurrences of P in the language of T. We call k-Approximate EDSM under the Hamming distance, k-Mismatch EDSM; and we call k-Approximate EDSM under edit distance, k-Edit EDSM. Bernardini et al. (Theoretical Computer Science, 2020) showed a simple O(kmp + kN)-time algorithm for k-Mismatch EDSM and an O(k2mp + kN)-time algorithm for k-Edit EDSM. We improve the dependency on k in both results, obtaining an Õ(k2/3mp +√kN)-time algorithm for k-Mismatch EDSM and an Õ(kmp + kN)-time algorithm for k-Edit EDSM. Bernardini et al. (Theory of Computing Systems, 2024) presented several algorithms for 1-Approximate EDSM working in Õ(np2 + N) time. They have also left the possibility to generalize these solutions for k > 1 as an open problem. We improve the runtime of their solution for 1-Mismatch and 1-Edit EDSM from Õ(np2 + N) to O(np2 + N). We further show algorithms for k-Approximate EDSM for the Hamming and edit distances working in Õ(np2 + N) time, for any constant k > 0. Finally, we show how our techniques can be applied to improve upon the complexity of the k-Approximate ED String Intersection and k-Approximate Doubly EDSM problems that were introduced very recently by Gabory et al. (Information and Computation, 2025).
approximate string matching, Hamming distance, edit distance, ED string, ddc: ddc:004
approximate string matching, Hamming distance, edit distance, ED string, ddc: ddc:004
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
