2004–2008
Xcerpt is a semi-structured query language for the Web. It stands out among languages of this kind because it combines aspects of different languages in novel ways, aiming at a versatile query language:
In its use of a graph data model, it stands closer to early semi-structured query languages such as Lorel than to current W3C XML query languages such as XPath, XSLT, or XQuery. A graph data model enables Xcerpt to faithfully represent id/idref links in XML as well as arbitrary RDF graphs. Previous versions of Xcerpt lack (node or object) identity and are thus better characterized as having infinite regular trees as their data model. Xcerpt 2.0, however, introduces full node identity and identity variables and thus moves towards a graph data model as in Lorel or object-oriented databases.
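To make the id/idref point concrete, here is a minimal Python sketch, not Xcerpt itself, that turns reference attributes in an XML tree into graph edges. The sample document, the attribute names `id` and `ref`, and the helper `resolve_references` are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the attribute names "id" and "ref" are assumptions.
DOC = """<bib>
  <author id="a1"><name>Ada</name></author>
  <book><title>Graphs</title><writtenBy ref="a1"/></book>
  <book><title>Trees</title><writtenBy ref="a1"/></book>
</bib>"""

def resolve_references(root):
    """Return (source, target) pairs, one per ref attribute naming a known id."""
    by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}
    links = []
    for el in root.iter():
        ref = el.get("ref")
        if ref in by_id:
            links.append((el, by_id[ref]))
    return links

root = ET.fromstring(DOC)
links = resolve_references(root)
targets = [target for _, target in links]

# Both books point at the very same author node, not at two equal copies.
shared = targets[0] is targets[1]
```

With node identity, the two references share one target node. Without it, each reference would have to be read as its own copy of the author subtree, which is the tree-shaped view described above for earlier versions.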
In its aim to address all specificities of XML with great care, it resembles current W3C recommended XML query languages such as XSLT or XQuery. Xcerpt is tailored to XML in numerous ways, e.g., by proper support for attributes, namespaces, XML base, comments, and processing instructions. This is achieved without sacrificing the conceptual simplicity and syntactic conciseness of the language. Some aspects of XML are treated differently from the W3C query languages, e.g., the transparent resolution of non-hierarchical relations.
In using (slightly enriched) patterns (or templates or examples) of the sought-for data for querying, it resembles the “query-by-example” paradigm and XML query languages such as XML-QL. In contrast, current XPath, XSLT, and XQuery use navigational access to XML data. This is very convenient for unary selection, where path expressions can be used, but it quickly becomes unwieldy for n-ary queries, where more complex, often nested FLWOR loops must be employed.
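The difference between pattern-based and navigational access can be sketched with a toy matcher in Python. This is an illustration under simplifying assumptions, not Xcerpt's syntax or matching semantics: terms are `(label, children)` tuples, `Var` marks a variable, and every child of the pattern only has to match some child of the data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    """A named variable inside a pattern (hypothetical helper)."""
    name: str

def match(pattern, data, env=None):
    """Yield every variable binding under which `pattern` matches `data`."""
    env = dict(env or {})
    if isinstance(pattern, Var):
        if pattern.name not in env:
            env[pattern.name] = data
            yield env
        elif env[pattern.name] == data:
            yield env
        return
    if isinstance(pattern, str) or isinstance(data, str):
        if pattern == data:
            yield env
        return
    p_label, p_children = pattern
    d_label, d_children = data
    if p_label != d_label:
        return
    # Each pattern child must match some data child; bindings accumulate.
    envs = [env]
    for p_child in p_children:
        envs = [e2 for e in envs for d_child in d_children
                for e2 in match(p_child, d_child, e)]
    yield from envs

bib = ("bib", [
    ("book", [("title", ["Graphs"]), ("author", ["Ada"])]),
    ("book", [("title", ["Trees"]), ("author", ["Bob"])]),
])

# One pattern shaped like the data binds title and author together.
pattern = ("bib", [("book", [("title", [Var("T")]), ("author", [Var("A")])])])
results = list(match(pattern, bib))
```

The pattern returns the two variables as pairs in a single step. A navigational formulation would typically select the books first and then navigate from each book to its title and author inside a loop.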
In offering a consistent extension of XML that overcomes certain restrictions which seem arbitrary in the context of Web querying, and of Xcerpt in particular, it is ready to incorporate access to data represented in richer data representation formats. Examples of such features are siblings whose relative order is irrelevant (and cannot be queried) and more flexible label alphabets.
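As a rough illustration of siblings whose relative order is irrelevant, the following Python sketch compares terms by a canonical form. The `Term` class, its `ordered` flag, and the helpers are hypothetical names introduced here and do not reflect how Xcerpt represents data internally.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    """A labelled node; `ordered=False` means sibling order carries no meaning."""
    label: str
    children: tuple = ()
    ordered: bool = True

def canonical(term):
    """Return a form that is identical for terms differing only in irrelevant order."""
    if isinstance(term, str):
        return term
    kids = [canonical(child) for child in term.children]
    if not term.ordered:
        kids.sort(key=repr)  # any fixed, deterministic order will do
    return (term.label, term.ordered, tuple(kids))

def same_data(a, b):
    return canonical(a) == canonical(b)

first = Term("first", ("Ada",))
last = Term("last", ("Lovelace",))

unordered_equal = same_data(Term("person", (first, last), ordered=False),
                            Term("person", (last, first), ordered=False))
ordered_equal = same_data(Term("person", (first, last)),
                          Term("person", (last, first)))
```

In this sketch the two unordered `person` terms count as the same data, while the ordered ones do not, since plain XML always fixes a document order for siblings.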
In providing (syntactical) extensions for querying, among others, RDF, Xcerpt becomes a versatile query language.
In its strict separation of querying and construction and in its use of logical variables and deductive rules, it resembles logic programming languages or Datalog. In contrast, SQL and XQuery, for example, mix construction and querying (nested queries) and use explicit references to views rather than rule chaining.
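The separation of querying from construction, together with rule chaining, can be sketched in a few lines of Datalog-style Python. This is a toy forward-chaining evaluator under stated assumptions, not Xcerpt's engine: facts are tuples of lowercase strings, uppercase strings are variables, and the `cites` and `reaches` predicates are invented for the example.

```python
def is_var(token):
    """Assumption of this sketch: variables are strings starting in uppercase."""
    return isinstance(token, str) and token[:1].isupper()

def unify(atom, fact, env):
    """Extend `env` so that `atom` equals `fact`, or return None."""
    if len(atom) != len(fact) or atom[0] != fact[0]:
        return None
    env = dict(env)
    for a, f in zip(atom[1:], fact[1:]):
        if is_var(a):
            if env.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return env

def query(body, facts):
    """Query part: only binds variables, never builds data."""
    envs = [{}]
    for atom in body:
        next_envs = []
        for env in envs:
            for fact in facts:
                extended = unify(atom, fact, env)
                if extended is not None:
                    next_envs.append(extended)
        envs = next_envs
    return envs

def construct(head, env):
    """Construct part: only fills bindings into the head, never queries."""
    return (head[0],) + tuple(env[t] if is_var(t) else t for t in head[1:])

def chain(rules, facts):
    """Apply all rules until no new fact is derived."""
    facts = set(facts)
    while True:
        new = {construct(head, env)
               for head, body in rules for env in query(body, facts)} - facts
        if not new:
            return facts
        facts |= new

rules = [
    (("reaches", "X", "Y"), [("cites", "X", "Y")]),
    (("reaches", "X", "Z"), [("cites", "X", "Y"), ("reaches", "Y", "Z")]),
]
derived = chain(rules, {("cites", "p1", "p2"), ("cites", "p2", "p3")})
```

The second rule consumes facts produced by the first (and by itself) without naming any view. The connection arises only because the rule head and the body atom share the predicate `reaches`, which is what rule chaining means here.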
Xcerpt
Tuesday, September 18, 2007
… where querying is as easy as creating data!
Prototype:
Web-Demo:
Documentation: