
doi: 10.3390/a14080225
Subgraph and supergraph search methods are promising techniques for the development of new drugs. For example, the chemical structure of favipiravir—an antiviral treatment for influenza—resembles the structure of some components of RNA. Represented as graphs, such compounds are similar to a subgraph of favipiravir. However, the existing supergraph search methods can only discover compounds that match exactly. We propose a novel problem, called similar supergraph search, and design an efficient algorithm to solve it. The problem is to identify all graphs in a database that are similar to any subgraph of a query graph, where similarity is defined as edit distance. Our algorithm represents the set of candidate subgraphs by a code tree, which it uses to efficiently compute edit distance. With a distance threshold of zero, our algorithm is equivalent to an existing efficient algorithm for exact supergraph search. Our experiments show that the computation time increased exponentially as the distance threshold increased, but increased sublinearly with the number of graphs in the database.
labeled graph, Industrial engineering. Management engineering, Electronic computers. Computer science, similarity graph, graph edit distance, QA75.5-76.95, T55.4-60.8, supergraph search
labeled graph, Industrial engineering. Management engineering, Electronic computers. Computer science, similarity graph, graph edit distance, QA75.5-76.95, T55.4-60.8, supergraph search
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 4 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
