
Graph data appears in a variety of application domains, and many uses of it, such as querying, matching, and transforming data, naturally result in incompletely specified graph data, that is, graph patterns. While queries need to be posed against such data, techniques for querying patterns are generally lacking, and properties of such queries are not well understood. Our goal is to study the basics of querying graph patterns. The key features of patterns we consider here are node and label variables and edges specified by regular expressions. We provide a classification of patterns, and study standard graph queries on graph patterns. We give precise characterizations of both data and combined complexity for each class of patterns. If complexity is high, we do further analysis of features that lead to intractability, as well as lower-complexity restrictions. Since our patterns are based on regular expressions, query answering for them can be captured by a new automata model. These automata have two modes of acceptance: one captures queries returning nodes, and the other queries returning paths. We study properties of such automata, and the key computational tasks associated with them. Finally, we provide additional restrictions for tractability, and show that some intractable cases can be naturally cast as instances of constraint satisfaction problems.
query languages, Graph databases, automata, Analysis of algorithms and problem complexity, Database theory, constraint satisfaction, Formal languages and automata, 004, graph patterns, graph databases, Languages, Theory, Small world graphs, complex networks (graph-theoretic aspects), complexity, Algorithms
query languages, Graph databases, automata, Analysis of algorithms and problem complexity, Database theory, constraint satisfaction, Formal languages and automata, 004, graph patterns, graph databases, Languages, Theory, Small world graphs, complex networks (graph-theoretic aspects), complexity, Algorithms
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 38 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
