Querying the Web with SPARQL

Publication type: Part of book or chapter of book, Conference object, Article, Contribution for newspaper or weekly magazine

Date: 01 Jan 2006

Publisher: Springer Berlin Heidelberg

Authors: Parsia, Bijan

doi: 10.1007/11837787_2


Abstract

In both conceptions, the common factor (the web) imposes certain requirements: extremely variable scalability (from a home page to community sites to sites that encompass a significant fraction of the web), rapid evolution, radical distribution, arbitrary interconnection and aggregation, and very little validation or other means of control. The demands of the web are forcing both the knowledge representation (KR) and the database communities to stretch their understanding and technology in different ways. While implementation techniques require revamping to deal with web scale, finding the right level and sort of expressiveness is even more critical. The web doesn’t just need bigger databases, it needs “better” ones.

The rise of semi-structured data, especially in the form of XML and associated languages, is driven by the success of HTML as a data representation language as well as its many failures. The amount of data that has been created or converted to HTML is staggering. HTML allows novices to publish all sorts of information quite easily while also supporting complex information structures (for example, see the typical site map of a large site). However, HTML is lacking in a number of ways, especially in the management, evolution, integration, and repurposing of data.

HTML, especially in common use, has (at least) three fundamental problems: malformed or misused constructs, a heavy presentation orientation, and a lack of needed expressivity. These problems stem from aspects of HTML (and associated software like the browser) that, we believe, contributed to its success. Browsers were very permissive in their parsing and rendering of HTML, which lowered the barrier to producing pages. Various presentation features in HTML made it an attractive platform for publishing information from software manuals to dictionaries to newspapers with ads. HTML’s core simplicity requires a lack of expressivity, which makes it easier to learn (and to learn to “abuse”).

More significantly, by pushing the balance of expressivity (and thus complexity) toward the presentation aspects of the language, it was relatively neutral toward content of different sorts. Consider the effect of requiring a specialized content language to be developed before one could publish, say, a recipe. Either the user would have to develop their own

Related Organizations

University of Salford
United Kingdom
