
pmid: 28858810
The vast majority of phylogenetic databases do not support declarative querying through which their contents can be flexibly and conveniently accessed, and the template-based query interfaces they do support do not allow arbitrary speculative queries. Consequently, they also do not support query optimization that leverages unique phylogeny properties. While a small number of graph query languages such as XQuery, Cypher, and GraphQL exist for computer-savvy users, most are too general and complex to be useful for biologists, and too inefficient for querying large phylogenies. In this paper, we discuss a recently introduced visual query language, called PhyQL, that leverages phylogeny-specific properties to support essential and powerful constructs for a large class of phylogenetic queries. We develop a range of pruning aids and, using these aids, propose a substantial set of query optimization strategies suitable for querying large phylogenies. We discuss a hybrid optimization technique that exploits a set of indices and "graphlet" partitioning. A "fail soonest" strategy is used to avoid hopeless processing and is shown to produce dividends. Possible novel optimization techniques yet to be explored are also discussed.
User-Computer Interface; Databases, Factual; Computational Biology; Programming Languages; Models, Theoretical; Algorithms; Phylogeny
| citations | 1 |
| popularity | Average |
| influence | Average |
| impulse | Average |
