Publication type: Article, Preprint, Conference object

Date: 01 Jan 2022

Embargo end date: 01 Jan 2022

Publisher: arXiv

Funded by: ANR | EQUUS

Authors: Amarilli, Antoine; Monet, Mikaël

doi: 10.48550/arxiv.2209.14878

arXiv: 2209.14878

Enumerating Regular Languages with Bounded Delay

Abstract

We study the task, for a given language $L$, of enumerating the (generally infinite) sequence of its words, without repetitions, while bounding the delay between two consecutive words. To allow for delay bounds that do not depend on the current word length, we assume a model where we produce each word by editing the preceding word with a small edit script, rather than writing out the word from scratch. In particular, this witnesses that the language is orderable, i.e., we can write its words as an infinite sequence such that the Levenshtein edit distance between any two consecutive words is bounded by a value that depends only on the language. For instance, $(a+b)^*$ is orderable (with a variant of the Gray code), but $a^* + b^*$ is not. We characterize which regular languages are enumerable in this sense, and show that this can be decided in PTIME in an input deterministic finite automaton (DFA) for the language. In fact, we show that, given a DFA $A$, we can compute in PTIME automata $A_1, \ldots, A_t$ such that $L(A)$ is partitioned as $L(A_1) \sqcup \ldots \sqcup L(A_t)$ and every $L(A_i)$ is orderable in this sense. Further, we show that the value of $t$ obtained is optimal, i.e., we cannot partition $L(A)$ into fewer than $t$ orderable languages. In the case where $L(A)$ is orderable (i.e., $t=1$), we show that the ordering can be produced by a bounded-delay algorithm: specifically, the algorithm runs in a suitable pointer machine model, and produces a sequence of bounded-length edit scripts to visit the words of $L(A)$ without repetitions, with bounded delay (exponential in $|A|$) between consecutive scripts. In fact, we show that we can achieve this while only allowing the edit operations push and pop at the beginning and end of the word, which implies that the word can in fact be maintained in a double-ended queue.

This is the full version, with proofs, of the STACS'23 article.
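The notion of orderability in the abstract can be illustrated on its own example $(a+b)^*$. The following is a minimal sketch, not the authors' construction: it assumes one possible Gray-code variant in which words are visited length by length, each length in reflected Gray code order, with the direction alternating so that moving to the next length only appends one letter. The function names (`gray_words`, `enumerate_all_words`, `levenshtein`) are hypothetical and chosen for illustration only.

```python
from itertools import count, islice


def gray_words(n):
    """All words of length n over {a, b}, in reflected Gray code order."""
    if n == 0:
        return [""]
    words = []
    for i in range(2 ** n):
        g = i ^ (i >> 1)              # i-th word of the reflected Gray code
        bits = format(g, f"0{n}b")    # n-bit string, most significant bit first
        words.append(bits.replace("0", "a").replace("1", "b"))
    return words


def enumerate_all_words():
    """Yield every word of (a+b)^* exactly once.

    Assumed ordering (illustrative only): odd lengths run forward through
    the Gray code, even lengths run backward. A forward block ends at
    b a^(n-1) and the next backward block starts at b a^n; a backward block
    ends at a^n and the next forward block starts at a^(n+1). Hence every
    length change is a single insertion at the end of the word.
    """
    for n in count(0):
        block = gray_words(n)
        if n % 2 == 0:
            block.reverse()
        yield from block


def levenshtein(u, v):
    """Standard dynamic-programming Levenshtein edit distance."""
    prev = list(range(len(v) + 1))
    for i, cu in enumerate(u, 1):
        cur = [i]
        for j, cv in enumerate(v, 1):
            cur.append(min(prev[j] + 1,                # delete cu
                           cur[j - 1] + 1,             # insert cv
                           prev[j - 1] + (cu != cv)))  # substitute or match
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    words = list(islice(enumerate_all_words(), 31))  # all words of length <= 4
    print(words[:7])
    print(max(levenshtein(u, v) for u, v in zip(words, words[1:])))
```

In this sketch, consecutive words are always at edit distance 1, but the edits are substitutions inside the word, so it does not illustrate the stronger push/pop-only result stated in the abstract. By contrast, the distance between $a^n$ and $b^m$ is $\max(n, m)$ (for example, `levenshtein("aaaa", "bbb")` evaluates to 4), and any repetition-free ordering of $a^* + b^*$ must switch between the two kinds of words infinitely often, which is the intuition for why that language is not orderable.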

Related Organizations

- Leibniz Association (Germany)
- French Institute for Research in Computer Science and Automation (France)
- French National Centre for Scientific Research (France)
- University of Lille (France)
- Institut Polytechnique de Paris (France)

Keywords

FOS: Computer and information sciences, Formal Languages and Automata Theory (cs.FL), Edit distance, Theory of computation → Formal languages and automata theory, Regular language, [INFO] Computer Science [cs], Constant-delay enumeration, Data Structures and Algorithms (cs.DS), ddc:004

Impact by BIP!

- Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Related to Research communities

INRIA