OpenLink & SAGE Project: Creating a GeoSPARQL Benchmark

The success of the HOBBIT benchmarking platform, along with the fact that many RDF / SPARQL systems are readily available for benchmarking on the platform, led us to consider using it in another R&D project in which OpenLink is involved, called SAGE.

The project “SAGE: Semantic Geospatial Analytics” aims to tackle the management of big geospatial data, by developing dedicated algorithms, time-efficient storage and querying strategies, extending the GeoSPARQL standard with continuous queries, and developing scalable link discovery approaches for streaming geospatial data. All these efforts will lower the main barriers to harnessing the full power of geospatial data in companies.

One of the initial tasks in SAGE, therefore, is to ensure GeoSPARQL compliance of OpenLink’s Virtuoso Universal Server. To automate the compliance test and make it a general test for all SPARQL processing systems, we opted to develop a GeoSPARQL benchmark as part of the HOBBIT benchmarking platform.

GeoSPARQL Benchmark for the HOBBIT Platform

The current version of the GeoSPARQL compliance benchmark is based on the example validation queries defined in the GeoSPARQL standard [document #11-052r4]. The same test dataset and test queries are used by SPARQLScore / Tester-for-Triplestores in their GeoSPARQL subtest. Even though they do not explicitly test all requirements defined within the standard, they do test a broad range of different requirement types. With this, they provide a quick and simple compliance test for the standard.

GeoSPARQL benchmark against Virtuoso v8.2

Results from running the current version of the GeoSPARQL benchmark against Virtuoso v8.2 Enterprise Edition at the HOBBIT platform

The initial runs of the benchmark showed that Virtuoso v8.2 Enterprise Edition was able to correctly answer most of the benchmark questions: it scored 4/6 (see the figure above). During internal testing, we got the same results with the latest Virtuoso v7.2 Open Source Edition, with GeoSPARQL support enabled.
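
To illustrate how a score such as 4/6 can be derived, here is a minimal sketch of a compliance check that counts the queries whose returned answers exactly match the expected answers. It is not the actual HOBBIT benchmark code: the function names, the query labels, and the sample result rows are all hypothetical, and the sketch assumes that an answer counts as correct only when it matches the expected result set exactly, regardless of row order.

```python
from typing import Dict, FrozenSet, Iterable, Sequence, Tuple

Row = Tuple[str, ...]
ResultSet = FrozenSet[Row]


def to_result_set(rows: Iterable[Sequence[str]]) -> ResultSet:
    """Normalize result rows into an order-independent set of tuples."""
    return frozenset(tuple(row) for row in rows)


def compliance_score(
    expected: Dict[str, ResultSet], actual: Dict[str, ResultSet]
) -> Tuple[int, int]:
    """Return (correct, total): a query is correct only on an exact match."""
    correct = sum(
        1 for query_id, answer in expected.items()
        if actual.get(query_id) == answer
    )
    return correct, len(expected)


# Hypothetical data for illustration only (not the benchmark's real answers).
expected = {
    "query-1": to_result_set([["my:B"], ["my:F"]]),
    "query-2": to_result_set([["my:C"]]),
    "query-3": to_result_set([["my:E"]]),
}
actual = {
    "query-1": to_result_set([["my:F"], ["my:B"]]),  # same rows, other order
    "query-2": to_result_set([["my:C"], ["my:D"]]),  # one extra row: incorrect
    "query-3": to_result_set([["my:E"]]),
}

correct, total = compliance_score(expected, actual)
print(f"{correct}/{total}")
```

Comparing sets rather than lists keeps the check independent of row order, which suits queries that do not specify an ordering.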

Since these initial tests, we at OpenLink have been working on extending the GeoSPARQL support to all aspects of the standard, which should first improve the benchmark score and later provide better query performance for geo-enabled queries. In parallel, as part of the R&D efforts in SAGE, our team will be working on extending the GeoSPARQL benchmark to thoroughly test all facets and aspects of the standard.
