DARQ - Federated Queries with SPARQL

Bastian Quilitz
( [bquilitz] [at] gmail [.dot.] com )

Last update:  28.06.2006

There are currently no plans to continue this project.
Read more about the latest changes.

Overview

DARQ is a query engine for federated SPARQL queries. It provides transparent query access to multiple, distributed SPARQL endpoints, as if they formed a single RDF graph. Applications see a single query interface; the details of federation are left to the query engine.

DARQ extends Andy Seaborne's ARQ (included in Jena) by adding a new query planning algorithm and a modified query execution engine. The work on DARQ includes a service description language and a basic query optimization algorithm.

The query engine is in an early stage of development. It cannot deal with all SPARQL queries and is not fully tested (see Limitations and known issues).


I'll be happy to receive comments and feedback: mailing list

The SourceForge project page can be found here.

Requirements

Contents

  1. Download and SVN access
  2. Example: Using DARQ
  3. Service Descriptions
  4. Limitations and known issues

Download and SVN access

DARQ is only available as Java source code from the SVN repository.

svn co https://svn.sourceforge.net/svnroot/darq/darq/trunk darq

Example: Using DARQ

Command line (Linux)

$DARQROOT/bin/darq --query <queryfile> --config <configfile>

<queryfile> file with SPARQL query
<configfile> file with Service Descriptions
There is a detailed example.

Source Code

DARQ provides a single query interface (same as ARQ), leaving the details of federation to the query engine. The example registers the DARQ query engine, executes the query and outputs the results. 

When the query engine is registered, DARQ requires a configuration file that contains the Service Descriptions.

// register new FedQueryEngineFactory and load configuration from file
FedQueryEngineFactory.register(configfile);

// create query
Query query = QueryFactory.create(querystring);

// get query engine
// DarqDataset is a dummy.
QueryExecution qe = QueryExecutionFactory.create(query, new DarqDataset());

// execute query
ResultSet rs = qe.execSelect();

// output results
ResultSetFormatter.out(System.out, rs, query);

Service Descriptions

Service Descriptions specify the capabilities of a SPARQL endpoint. They provide a declarative description of the data available from an endpoint, the definition of limitations on access patterns and statistical information about the available data that is used for query optimization.
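
To make the first of these roles concrete, the sketch below shows one way an engine could use the predicates declared for each endpoint to decide which endpoints are relevant to a triple pattern. It is a minimal illustration, not DARQ's implementation: the SourceSelection class, its method names and the example.org endpoint URLs are hypothetical, and it assumes that every triple pattern has a bound predicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of DARQ: maps endpoint URLs to the
// predicates each endpoint declares it can answer.
class SourceSelection {
    private final Map<String, Set<String>> predicatesByEndpoint = new LinkedHashMap<>();

    // Record that an endpoint holds triples with the given predicates.
    void declare(String endpointUrl, String... predicates) {
        predicatesByEndpoint
            .computeIfAbsent(endpointUrl, k -> new HashSet<>())
            .addAll(Arrays.asList(predicates));
    }

    // Return, in declaration order, the endpoints that declare the predicate.
    List<String> relevantEndpoints(String predicate) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : predicatesByEndpoint.entrySet()) {
            if (e.getValue().contains(predicate)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SourceSelection selection = new SourceSelection();
        // Prefixed names are kept as plain strings for brevity.
        selection.declare("http://example.org/sparql/a", "foaf:name", "foaf:mbox");
        selection.declare("http://example.org/sparql/b", "foaf:name");

        System.out.println(selection.relevantEndpoints("foaf:name"));
        System.out.println(selection.relevantEndpoints("foaf:mbox"));
        System.out.println(selection.relevantEndpoints("dc:title"));
    }
}
```

A pattern whose predicate no endpoint declares yields an empty list, so a planner built along these lines could decide that no remote request is needed for it.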

The following example shows a Service Description:
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix sd: <http://darq.sf.net/dose/0.1#> .

@prefix foaf: <http://xmlns.com/foaf/0.1/#> .

# definition of an endpoint
[] a sd:Service ;
rdfs:label "Foaf Service" ;
rdfs:comment "Service for FOAF data" ;

# the endpoint url
sd:url "http://localhost:2020/ldap" ;

# capabilities of the endpoint
sd:capability [

# the endpoint stores triples with predicate rdf:type
sd:predicate rdf:type ;

# Restriction on the subject/object
# Every legal SPARQL filter expression is allowed.
#
# only queries for the type http://xmlns.com/foaf/0.1/#Person
# are allowed

sd:sofilter "REGEX(STR(?object),'http://xmlns.com/foaf/0.1/#Person')" ;
# could also use ?subject

# statistical information

# number of triples that will be returned by
# a "?s rdf:type ?o" query
sd:triples 18000 ;

# other properties are:
# Selectivity of a triple pattern, when object/subject is bound
# sd:objectSelectivity (default=1)
# sd:subjectSelectivity (default=1/x,
# where x is the value given by sd:triples)

];

sd:capability [

# the endpoint stores triples with predicate foaf:name
sd:predicate foaf:name ;

# no filter on subject or object
sd:sofilter "" ;

# statistical information

# there are 18000 triples with predicate foaf:name
sd:triples 18000 ;

# if the object in the triple pattern is bound
# (e.g. ?s foaf:name "Bastian Quilitz") the result size will be
# reduced by a factor of 0.02 (on average).
sd:objectSelectivity "0.02"^^xsd:double ;

] ;

sd:capability [
sd:predicate foaf:mbox ;
sd:sofilter "" ;
sd:triples 18000 ;
sd:objectSelectivity 5.5E-5
] ;


# whether the service is definitive or not
# sd:isDefinitive (default=false)
# sd:isDefinitive "true"^^xsd:boolean ;

# limitations on access patterns
# the query for this service must either contain a triple pattern
# with predicate foaf:name and a bound object or
# a pattern with predicate foaf:mbox and a bound object.
# not shown here: sd:subjectBinding -> subject must be bound
sd:requiredBindings [ sd:objectBinding foaf:name ] ;
sd:requiredBindings [ sd:objectBinding foaf:mbox ] ;

# total number of triples in the store
sd:totalTriples 108000;
.
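
The access-pattern limitations and statistics in this description can be read as simple rules. The following sketch applies them to individual triple patterns: it checks the sd:requiredBindings condition and estimates a result size from sd:triples and the selectivity values. The class and method names are hypothetical and are not DARQ's API. Multiplying both selectivities when subject and object are both bound is an added assumption; the description itself does not state how the two combine.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the rules described above; not DARQ code.
class ServiceDescriptionRules {

    // A triple pattern reduced to the facts the rules need.
    static final class Pattern {
        final String predicate;
        final boolean subjectBound;
        final boolean objectBound;

        Pattern(String predicate, boolean subjectBound, boolean objectBound) {
            this.predicate = predicate;
            this.subjectBound = subjectBound;
            this.objectBound = objectBound;
        }
    }

    // sd:requiredBindings with sd:objectBinding: the query is accepted if at
    // least one pattern uses a listed predicate with a bound object.
    // An empty list means the service declares no such limitation.
    static boolean satisfiesObjectBindings(List<Pattern> query, List<String> requiredPredicates) {
        if (requiredPredicates.isEmpty()) {
            return true;
        }
        for (Pattern p : query) {
            if (p.objectBound && requiredPredicates.contains(p.predicate)) {
                return true;
            }
        }
        return false;
    }

    // Estimated result size for one pattern: start from sd:triples and scale
    // by the selectivity of each bound position (assumed independent).
    static double estimateResults(long triples, double subjectSelectivity,
                                  double objectSelectivity, Pattern p) {
        double size = triples;
        if (p.subjectBound) size *= subjectSelectivity;
        if (p.objectBound) size *= objectSelectivity;
        return size;
    }

    public static void main(String[] args) {
        List<String> required = Arrays.asList("foaf:name", "foaf:mbox");
        Pattern nameBound = new Pattern("foaf:name", false, true);
        Pattern nameOpen = new Pattern("foaf:name", false, false);

        System.out.println(satisfiesObjectBindings(Arrays.asList(nameBound), required));
        System.out.println(satisfiesObjectBindings(Arrays.asList(nameOpen), required));

        // foaf:name capability from the example: 18000 triples,
        // objectSelectivity 0.02, default subjectSelectivity 1/18000.
        System.out.println(estimateResults(18000, 1.0 / 18000, 0.02, nameBound));
        System.out.println(estimateResults(18000, 1.0 / 18000, 0.02, nameOpen));
    }
}
```

With the values from the example, a foaf:name pattern with a bound object is estimated at roughly 360 results, while the same pattern with an unbound object is estimated at the full 18000 triples and, on its own, does not satisfy the required bindings.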

Limitations and known issues





DARQ is hosted by sourceforge.net