
handle: 10356/2373
The rapid growth of Web information and applications has made the Web not only an important source of information but also a hub for e-commerce activities. However, current web documents, stored as unstructured HTML files, offer limited support for advanced web applications. To overcome this shortcoming, future web documents will likely be formatted in XML, and existing HTML documents will gradually be converted to XML. With XML, the structure of web documents, in the form of DTDs, can be provided as input to a search engine, allowing it to exploit this structural knowledge in query processing. In this report, we propose a query model that supports expressive queries on XML documents that share some common DTDs. As XML documents can embed well-structured links to one another, the query model also supports queries involving inter-document links. Because the proposed query model covers both intra- and inter-document structures, conventional indexing techniques are no longer adequate. We have therefore designed a new indexing scheme built upon both the content and the structure of XML documents. Based on this indexing scheme, a new search engine that supports queries on the content and structure of web documents has been developed.
Master of Engineering (SCE)
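As a rough illustration of what indexing on both content and structure can mean, the following minimal sketch keys an inverted index on the pair (element path, term) rather than on the term alone. It is not the indexing scheme developed in the thesis; the function names `build_index` and `search`, the sample documents, and the path notation are all assumptions made for this example, and inter-document links are not modelled.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(docs):
    """Map (element path, lowercased term) -> set of document ids.

    Hypothetical helper: only element text is indexed; attributes,
    tail text, and links between documents are ignored.
    """
    index = defaultdict(set)

    def walk(elem, path, doc_id):
        here = f"{path}/{elem.tag}"
        for term in (elem.text or "").lower().split():
            index[(here, term)].add(doc_id)
        for child in elem:
            walk(child, here, doc_id)

    for doc_id, xml_text in docs.items():
        walk(ET.fromstring(xml_text), "", doc_id)
    return index


def search(index, path, term):
    """Return sorted ids of documents containing `term` under `path`."""
    return sorted(index.get((path, term.lower()), set()))


# Two made-up documents that share the same structure.
DOCS = {
    "d1": "<book><title>XML indexing</title><author>Lim</author></book>",
    "d2": "<book><title>Web search</title><author>XML Lim</author></book>",
}
INDEX = build_index(DOCS)

print(search(INDEX, "/book/title", "xml"))
print(search(INDEX, "/book/author", "xml"))
```

A plain keyword index would return both documents for "xml"; keying on the path lets a query distinguish a match in the title from a match in the author element.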
DRNTU::Engineering::Computer science and engineering::Information systems::Information storage and retrieval, 004
