
arXiv: 1403.6414
Several algorithms for similarity search employ seeding techniques to quickly discard very dissimilar regions. In this paper, we study theoretical properties of lossless seeds, i.e., spaced seeds having full sensitivity. We prove that lossless seeds coincide with languages of certain sofic subshifts, hence they can be recognized by finite automata. Moreover, we show that these subshifts are fully given by the number of allowed errors k and the seed margin l. We also show that for a fixed k, optimal seeds must asymptotically satisfy l ~ m^(k/(k+1)).
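To make the notion of full sensitivity concrete, the following is a minimal brute-force sketch, not the automaton-based recognition described in the paper. It assumes a seed is written as a string over `#` (position that must match) and `-` (don't-care position), that similarity means at most k mismatches between two strings of length m, and that the seed margin is l = m - len(seed). The function name `is_lossless` and this encoding are illustrative assumptions.

```python
from itertools import combinations


def is_lossless(seed: str, m: int, k: int) -> bool:
    """Return True if `seed` detects every placement of k mismatches in length m.

    Brute force over all C(m, k) mismatch placements, so only suitable
    for small m and k.
    """
    span = len(seed)
    if span > m:
        return False  # the seed cannot be placed at all
    # Offsets inside the seed that must fall on matching positions.
    must_match = [j for j, c in enumerate(seed) if c == "#"]
    # The seed can be placed at l + 1 offsets, where l = m - span.
    offsets = range(m - span + 1)
    # Checking exactly k mismatches suffices: any placement with fewer
    # mismatches is contained in one with k, which is harder to hit.
    for errors in combinations(range(m), k):
        err = set(errors)
        hit = any(all(i + j not in err for j in must_match) for i in offsets)
        if not hit:
            return False  # this mismatch placement escapes the seed
    return True


if __name__ == "__main__":
    print(is_lossless("##-#", 6, 1))
    print(is_lossless("##-#", 5, 1))
```

Under these assumptions, the seed `##-#` is lossless for m = 6 and k = 1, but not for m = 5, where a single mismatch at the second position blocks both possible placements.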
In Proceedings AFL 2014, arXiv:1405.5272
FOS: Computer and information sciences, Discrete Mathematics (cs.DM), QA75.5-76.95, lossless seeds, [MATH.MATH-CO] Mathematics [math]/Combinatorics [math.CO], pattern matching, Electronic computers. Computer science, QA1-939, FOS: Mathematics, Mathematics - Combinatorics, Combinatorics (math.CO), Mathematics, Computer Science - Discrete Mathematics
