The SQL++ Query Language: Configurable, Unifying and Semi-structured

Preprint English OPEN
Ong, Kian Win ; Papakonstantinou, Yannis ; Vernoux, Romain (2014)
  • Subject: Computer Science - Databases
    acm: InformationSystems_DATABASEMANAGEMENT

NoSQL databases support semi-structured data, typically modeled as JSON. They also provide limited (but expanding) query languages. Their idiomatic, non-SQL language constructs, their many variations, and their lack of formal semantics inhibit deep understanding of the query languages and impede progress towards clean, powerful, declarative query languages. This paper specifies the syntax and semantics of SQL++, which is applicable to both JSON-native stores and SQL databases. The SQL++ semi-structured data model is a superset of both JSON and the SQL data model. SQL++ offers powerful computational capabilities for processing semi-structured data, akin to prior non-relational query languages, notably OQL and XQuery. Yet SQL++ is backwards compatible with SQL and is generalized towards JSON by introducing only a small number of query language extensions to SQL. Recognizing that a query language standard is probably premature for the fast-evolving area of NoSQL databases, SQL++ includes configuration options that formally itemize the semantic variations that language designers may choose from. The options often pertain to the treatment of semi-structuredness (missing attributes, heterogeneous types, etc.), where more than one sensible approach is possible. SQL++ is unifying: by appropriate choices of configuration options, the SQL++ semantics can morph into the semantics of existing semi-structured database query languages. The extensive experimental validation shows how SQL and four semi-structured database query languages (MongoDB, Cassandra CQL, Couchbase N1QL and AsterixDB AQL) are formally described by appropriate settings of the configuration options. Early signs of SQL++ adoption are positive: Version 4 of Couchbase's N1QL is explained as syntactic sugar over SQL++, AsterixDB will soon support the full SQL++, and Apache Drill is in the process of aligning with SQL++.
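To make the idea of a configuration option for missing attributes concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `MissingPolicy` values, the `project` helper, and the sample rows are hypothetical and are not the option names, syntax, or semantics defined in the paper. It shows one projection over JSON-like records giving different results depending on which treatment of a missing attribute is selected.

```python
from enum import Enum


class MissingPolicy(Enum):
    """Hypothetical option values; not the option names used in the paper."""
    ERROR = "error"  # treat navigation into a missing attribute as an error
    NULL = "null"    # behave as if the attribute were present with a null value
    SKIP = "skip"    # leave the attribute out of the output tuple


_ABSENT = object()  # sentinel: distinguishes "no such attribute" from an explicit null


def project(rows, attrs, policy):
    """Project each JSON-like dict in `rows` onto `attrs` under `policy`."""
    out = []
    for row in rows:
        result = {}
        for attr in attrs:
            value = row.get(attr, _ABSENT)
            if value is not _ABSENT:
                result[attr] = value
            elif policy is MissingPolicy.ERROR:
                raise KeyError(attr)
            elif policy is MissingPolicy.NULL:
                result[attr] = None
            # MissingPolicy.SKIP: nothing is added for this attribute
        out.append(result)
    return out


# Heterogeneous input: the second record has no "city" attribute.
rows = [{"id": 1, "city": "Oslo"}, {"id": 2}]

as_null = project(rows, ["id", "city"], MissingPolicy.NULL)
as_skip = project(rows, ["id", "city"], MissingPolicy.SKIP)
```

Under the `NULL` setting the second output record carries `city` with a null value, under `SKIP` it has no `city` attribute at all, and under `ERROR` the same call raises. Selecting among such behaviors is the kind of choice the abstract attributes to configuration options.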