
doi: 10.1109/waim.2008.97
We present a SAX implementation of the statistical embedding associated with XML data, introduced in [1], [2], which allows one to efficiently decide ε-validity with respect to any DTD or Schema under the Edit Distance with Moves. The embedding associates a generalized k-gram with each unranked labelled tree (with k = 1/ε), from which any regular property can be approximately decided. We show how to compute the k-gram exactly with a SAX implementation using a memory of size d, the depth of the tree, and how to compute an approximate k-gram with queues of size M = 2k and a global memory of size 2k in the worst case. Experiments on large XML files from the XML benchmark project confirm the error analysis for various values of M.
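To illustrate why a SAX pass needs only memory proportional to the depth d, here is a minimal Python sketch. It is not the generalized k-gram of [1], [2], whose definition is not given here; as a simplifying assumption it counts only k-grams of labels along ancestor paths. The names `PathKGramHandler` and `path_kgram_vector` are hypothetical, chosen for this example.

```python
import xml.sax
from collections import Counter


class PathKGramHandler(xml.sax.ContentHandler):
    """Counts k-grams of element labels along ancestor paths (a simplification)."""

    def __init__(self, k):
        super().__init__()
        self.k = k
        self.stack = []          # labels of currently open elements; size <= depth d
        self.counts = Counter()  # k-gram (tuple of labels) -> number of occurrences
        self.max_depth = 0       # largest stack size seen, i.e. the tree depth

    def startElement(self, name, attrs):
        self.stack.append(name)
        self.max_depth = max(self.max_depth, len(self.stack))
        # The k most recent open labels form one path k-gram ending at this node.
        if len(self.stack) >= self.k:
            self.counts[tuple(self.stack[-self.k:])] += 1

    def endElement(self, name):
        self.stack.pop()


def path_kgram_vector(xml_text, k):
    """Return (relative k-gram frequencies, observed depth) for an XML string."""
    handler = PathKGramHandler(k)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    total = sum(handler.counts.values())
    freqs = {g: c / total for g, c in handler.counts.items()} if total else {}
    return freqs, handler.max_depth


if __name__ == "__main__":
    doc = "<a><b><c/></b><b><c/><c/></b></a>"
    freqs, depth = path_kgram_vector(doc, k=2)
    print(freqs, depth)
```

The stack holds one label per open element, so the working memory for the traversal is bounded by the depth of the tree and does not grow with file size. The counter of distinct k-grams is kept separately in this sketch.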
