A SPARQL query in ARQ goes through several stages of processing:
- String to Query (parsing)
- Translation from Query to a SPARQL algebra expression
- Optimization of the algebra expression
- Query plan determination and low-level optimization
- Evaluation of the query plan
This page describes how to access and use expressions in the SPARQL algebra within ARQ. The SPARQL algebra is defined in section 12 of the SPARQL specification. ARQ can be extended to modify the evaluation of the algebra form to access different graph storage implementations.
The classes for the data structures of the algebra reside in the package org.apache.jena.sparql.algebra, in the op subpackage. All the classes are named “Op...”; the interface that they all offer is “Op”.
Viewing the algebra expression for a Query
The command line tool arq.qparse will print the algebra form of a query:
arq.qparse --print=op --query=Q.rq
arq.qparse --print=op 'SELECT * { ?s ?p ?o}'
The syntax of the output is SSE, a simple format for writing data structures involving RDF terms. It can be read back in again to produce the Java form of the algebra expression.
Turning a query into an algebra expression
Getting the algebra expression for a Query is simply a matter of passing the parsed Query object to the translation function in the Algebra class:
Query query = QueryFactory.create(.....) ;
Op op = Algebra.compile(query) ;
And back again.
Query query = OpAsQuery.asQuery(op) ;
System.out.println(query.serialize()) ;
This reverse translation can handle any algebra expression that originally came from a SPARQL query, but not every algebra expression. It is possible to create programmatically useful algebra expressions that cannot be turned into a query, especially if they involve algebra operators that have no equivalent in SPARQL syntax. Also, the query produced may not be exactly the same as the original, but it will yield the same results (for example, filters may be moved, because the algebra translation in the SPARQL specification moves filter expressions around).
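The round trip described above can be sketched as a small program. This is a minimal illustration, assuming Jena is on the classpath; the class name `AlgebraRoundTrip`, the method `roundTrip` and the sample query are hypothetical and chosen only for the example.

```java
import org.apache.jena.query.Query;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.sparql.algebra.Algebra;
import org.apache.jena.sparql.algebra.Op;
import org.apache.jena.sparql.algebra.OpAsQuery;

// Hypothetical helper class, not part of ARQ.
class AlgebraRoundTrip {
    /** Parse a query string, translate it to algebra, then translate it back. */
    static Query roundTrip(String queryString) {
        Query query = QueryFactory.create(queryString); // string -> Query
        Op op = Algebra.compile(query);                 // Query -> algebra
        return OpAsQuery.asQuery(op);                   // algebra -> Query
    }

    public static void main(String[] args) {
        // The printed query may differ in layout from the input string,
        // but it is expected to yield the same results.
        Query q = roundTrip("SELECT * { ?s ?p ?o FILTER(?o > 5) }");
        System.out.println(q.serialize());
    }
}
```

Comparing the printed query with the input is a quick way to see where the translation has moved a filter.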
Directly reading and writing algebra expressions
The SSE class is a collection of functions to parse SSE expressions for the SPARQL algebra, and also for RDF terms, filter expressions and even datasets and graphs.
Op op = SSE.parseOp("(bgp (?s ?p ?o))") ; // Read a string
Op op = SSE.readOp("filename.sse") ; // Read a file
The SSE class simply calls the appropriate builder operation from the org.apache.jena.sparql.sse.builder package.
To go with this, there is a collection of writers for many of the Java structures in ARQ.
Op op = ... ;
SSE.write(op) ; // Write to stdout
Writers default to writing to System.out but support calls to any output stream (the writer manages the conversion to UTF-8) and to ARQ's own IndentedWriter form for embedding in structured output. Again, SSE is simply passing the calls to the writer operation from the org.apache.jena.sparql.sse.writer package.
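As an illustration of writing to an output stream rather than to System.out, the sketch below serializes an expression into a byte buffer and parses the text back in. It assumes the `SSE.write(OutputStream, Op)` overload; the class `SseRoundTrip` and its methods are hypothetical names used only here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import org.apache.jena.sparql.algebra.Op;
import org.apache.jena.sparql.sse.SSE;

// Hypothetical helper class, not part of ARQ.
class SseRoundTrip {
    /** Serialize an algebra expression to SSE text through an output stream. */
    static String toSse(Op op) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        SSE.write(out, op);  // the writer emits UTF-8 bytes
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Check that parsing the written text gives back an equal expression. */
    static boolean roundTrips(String sseText) {
        Op op = SSE.parseOp(sseText);
        Op again = SSE.parseOp(toSse(op));
        return op.equals(again);
    }

    public static void main(String[] args) {
        System.out.println(toSse(SSE.parseOp("(bgp (?s ?p ?o))")));
        System.out.println(roundTrips("(bgp (?s ?p ?o))"));
    }
}
```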
Creating an algebra expression programmatically
See the example in AlgebraExec.
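For orientation, a small expression can be built by hand roughly as follows. This is a sketch in the spirit of that example rather than a copy of it: the class name `BuildAlgebra`, the property URI `http://example/p` and the filter condition are assumptions made for illustration.

```java
import org.apache.jena.graph.NodeFactory;
import org.apache.jena.graph.Triple;
import org.apache.jena.sparql.algebra.Op;
import org.apache.jena.sparql.algebra.op.OpBGP;
import org.apache.jena.sparql.algebra.op.OpFilter;
import org.apache.jena.sparql.core.BasicPattern;
import org.apache.jena.sparql.core.Var;
import org.apache.jena.sparql.expr.E_LessThan;
import org.apache.jena.sparql.expr.Expr;
import org.apache.jena.sparql.expr.ExprVar;
import org.apache.jena.sparql.expr.NodeValue;

// Hypothetical helper class, not part of ARQ.
class BuildAlgebra {
    /** Build the algebra for: ?x <http://example/p> ?z, filtered by ?z < 2. */
    static Op build() {
        Var varX = Var.alloc("x");
        Var varZ = Var.alloc("z");

        // A basic graph pattern holding a single triple pattern.
        BasicPattern pattern = new BasicPattern();
        pattern.add(Triple.create(varX, NodeFactory.createURI("http://example/p"), varZ));
        Op op = new OpBGP(pattern);

        // Wrap the pattern in a filter on ?z.
        Expr lessThanTwo = new E_LessThan(new ExprVar(varZ), NodeValue.makeInteger(2));
        return OpFilter.filter(lessThanTwo, op);
    }

    public static void main(String[] args) {
        // Op.toString() prints the expression in SSE syntax.
        System.out.println(build());
    }
}
```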
To produce the complete javadoc for ARQ, download an ARQ distribution and run the ant task ‘javadoc-all’.
Evaluating an algebra expression
QueryIterator qIter = Algebra.exec(op,graph) ;
QueryIterator qIter = Algebra.exec(op,datasetGraph) ;
Evaluating an algebra expression produces an iterator of query solutions (called Bindings).
Var var_x = Var.alloc("x") ; // The variable to read from each solution
for ( ; qIter.hasNext() ; )
{
    Binding b = qIter.nextBinding() ;
    Node n = b.get(var_x) ;
    System.out.println(var_x+" = "+FmtUtils.stringForNode(n)) ;
}
qIter.close() ;
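Putting the pieces together, an end-to-end evaluation over a small in-memory graph might look like the sketch below. The data, the URIs and the class name `EvaluateAlgebra` are made up for the example; the try/finally block is one way to make sure the iterator is closed even if processing a solution fails.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.jena.graph.Node;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.sparql.algebra.Algebra;
import org.apache.jena.sparql.algebra.Op;
import org.apache.jena.sparql.core.Var;
import org.apache.jena.sparql.engine.QueryIterator;
import org.apache.jena.sparql.engine.binding.Binding;
import org.apache.jena.sparql.sse.SSE;

// Hypothetical helper class, not part of ARQ.
class EvaluateAlgebra {
    /** Evaluate a one-triple pattern over a one-triple graph; collect ?x. */
    static List<String> valuesOfX() {
        // Illustrative data: <http://example/s> <http://example/p> "hello"
        Model model = ModelFactory.createDefaultModel();
        model.add(model.createResource("http://example/s"),
                  model.createProperty("http://example/p"),
                  "hello");

        Op op = SSE.parseOp("(bgp (?s <http://example/p> ?x))");
        Var varX = Var.alloc("x");

        List<String> values = new ArrayList<>();
        QueryIterator qIter = Algebra.exec(op, model.getGraph());
        try {
            while (qIter.hasNext()) {
                Binding b = qIter.nextBinding();
                Node n = b.get(varX);
                values.add(n.getLiteralLexicalForm());
            }
        } finally {
            qIter.close(); // always release the iterator
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(valuesOfX());
    }
}
```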
The CONSTRUCT, DESCRIBE and ASK operations are done on top of algebra evaluation. Applications can access this functionality by creating their own QueryEngine (see arq.examples.engine.MyQueryEngine) and its factory. A query engine is a one-time use object for each query execution.