
XML stream applications bring the novel challenge of efficiently processing queries over sequentially accessible, token-based input streams. Our Raindrop project is the first to accommodate token-based stream processing in an algebraic framework where both tokens and tuples are modeled in a uniform manner. In this paper, we illustrate how the stream loading model of our system conducts XML navigation over the input stream on the fly while concurrently constructing a minimized, lightweight XML tree representation, called a navigation-free data instance. These captured XML fragments are minimized in terms of buffer consumption. Based on the compact representation of the navigation-free data instances, we propose techniques for subsequent algebraic query evaluation, in particular effective strategies for supporting multi-mode query operators and alternative data output semantics. The proposed stream loading model requires a much smaller buffer footprint than alternative solutions in the literature such as Y-Filter. In addition, the proposed algebra-based evaluation techniques offer effective ways to handle data recursion over XML streams, namely by avoiding the overhead of structural join operators. Our stream loading and query evaluation techniques have been implemented as part of the Raindrop system, and experimental results based on this implementation are also reported in this paper.
