Benchmarking Versioning Systems with SPVB v2.0

Several storage benchmarks test how efficiently Linked Data systems store and query data, but they do not address the management of data versions. To the best of our knowledge, only a limited number of versioning systems (mostly academic) and benchmarks exist for storing versions and testing the proposed solutions. However, the existence of such systems and benchmarks is of the utmost importance, as dynamicity is an indispensable part of the Linked Open Data (LOD) initiative [4, 5]. In particular, both the data and the schema of LOD datasets are constantly evolving for several reasons, such as the inclusion of new experimental evidence or observations, or the correction of erroneous conceptualizations [3]. The open nature of the Web implies that these changes typically happen without any warning, centralized monitoring, or reliable notification mechanism; this raises the need to keep track of the different versions of the datasets and introduces new challenges related to assuring the quality and traceability of Web data over time.

In this blog post we discuss the second version of the Semantic Publishing Versioning Benchmark (SPVB v2.0), the successor of the SPVB described in a previous blog post. The benchmark is based on the Linked Data Benchmark Council’s (LDBC) [1] Semantic Publishing Benchmark (SPB), which in turn is based on the scenario of the BBC [2] media organisation, a heavy user of Linked Data technologies such as RDF and SPARQL. We extend SPB to produce SPVB, a versioning benchmark that is not tailored to any particular versioning strategy or system.

The first component of the benchmark is the Data Generator, which is responsible for generating the evolved data and the SPARQL queries, along with their expected results, that have to be evaluated by the benchmarked systems. To do so, it requires the following parameters:

  • The number of required versions
  • The size of the initial version
  • The proportion of added and deleted triples from one version to another
  • The generated data form (independent copies or change-sets).
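
As a rough illustration of how these four parameters interact, the sketch below groups them in a small configuration object and estimates the size of each version. The names (GeneratorConfig, planned_sizes) are hypothetical and not part of SPVB, and we assume here that the added and deleted proportions are percentages of the previous version's size, which the actual generator may define differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneratorConfig:
    """Hypothetical container for the four data generator parameters."""
    num_versions: int         # number of required versions
    initial_triples: int      # size of the initial version, in triples
    added_percent: int        # triples added per version, in % of the previous version (assumed)
    deleted_percent: int      # triples deleted per version, in % of the previous version (assumed)
    data_form: str = "change-sets"  # or "independent-copies"

    def __post_init__(self):
        if self.num_versions < 1 or self.initial_triples < 0:
            raise ValueError("need at least one version and a non-negative size")
        if self.data_form not in ("change-sets", "independent-copies"):
            raise ValueError("unknown data form: " + self.data_form)


def planned_sizes(cfg):
    """Return the estimated number of triples in each version."""
    sizes = [cfg.initial_triples]
    for _ in range(cfg.num_versions - 1):
        prev = sizes[-1]
        added = prev * cfg.added_percent // 100      # integer arithmetic avoids float rounding
        deleted = prev * cfg.deleted_percent // 100
        sizes.append(prev + added - deleted)
    return sizes
```

For example, three versions starting from 1000 triples with 10% additions and 5% deletions yield planned sizes of 1000, 1050 and 1103 triples under these assumptions.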

The Semantic Publishing Benchmark (SPB) generator is used to produce the initial version of the dataset as well as the add-sets of the upcoming versions. The SPB data generator produces RDF descriptions of creative works, i.e. metadata about one or more real entities that exist in a set of reference datasets (snapshots of the real datasets provided by the BBC, in addition to the GeoNames and DBpedia datasets). SPB’s data generator models the following three types of relations in the data:

  • Clustering of data: The clustering effect is produced by generating creative works about a single entity from reference datasets and for a fixed period of time.
  • Correlations of entities: This correlation effect is produced by generating creative works about two or three entities from reference data in a fixed period of time.
  • Random tagging of entities: Random data distributions are defined with a bias towards popular entities created when the tagging of a creative work is performed.

SPB follows distributions that have been obtained from real-world datasets, thereby producing data that bear similar characteristics to real ones. SPVB includes datasets, and versions thereof, that respect the aforementioned principles. In addition to the produced creative works, 5 different versions of the DBpedia reference dataset are maintained as well; these are evenly distributed across the total number of versions that the data generator is configured to produce. Regarding the delete-sets, the SPVB data generator selects, from each previous version, creative works that were randomly produced by the SPB data generator until it reaches the total number of triples that have to be deleted, as specified by the user.
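
The delete-set selection described above can be sketched roughly as follows. This is only an illustration under our own assumptions, not the SPVB code: the function name select_delete_set is hypothetical, the candidate pool is assumed to contain only eligible creative works, and whole creative works are removed, so the target may be slightly exceeded.

```python
import random


def select_delete_set(creative_works, triples_to_delete, seed=0):
    """Pick creative works at random until enough triples are covered.

    creative_works: dict mapping a creative work id to its list of triples
                    (assumed to hold only works eligible for deletion).
    Returns the chosen work ids and the number of triples they contain.
    """
    rng = random.Random(seed)            # seeded for reproducible runs
    candidates = sorted(creative_works)  # stable order before shuffling
    rng.shuffle(candidates)

    chosen, deleted = [], 0
    for work_id in candidates:
        if deleted >= triples_to_delete:
            break
        chosen.append(work_id)
        deleted += len(creative_works[work_id])
    return chosen, deleted
```

If the pool holds fewer triples than requested, the sketch simply returns everything it has; a real generator would presumably need an explicit policy for that case.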

Versioning Benchmark V2.0

Next, for each of the eight versioning query types [6], the data generator produces a set of SPARQL queries that must be executed by the benchmarked systems. For the query types that refer to “structured queries”, the generator uses 7 of the 25 query templates proposed by the DBPSB [7], which are based on real DBpedia queries. The generated SPARQL queries are then evaluated on top of Virtuoso [8], into which the already generated evolved data have been loaded (each version in its own named graph). After the queries have been evaluated, they are sent, along with their expected results, to the Task Generator component, while the generated data are sent to the Benchmarked System. The job of the Task Generator is to send the SPARQL queries to the benchmarked system and the expected results to the Evaluation Storage component. With both the data and the tasks available, the benchmarked system can evaluate the queries and report its results to the Evaluation Storage. Finally, the Evaluation Storage is used by the Evaluation Module component, which is responsible for evaluating the results and calculating the values of the KPIs.
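
To make the "one named graph per version" layout more concrete, here is a minimal sketch of how queries over such a layout could be built. The graph IRI scheme and the function names are our own assumptions for illustration; the actual IRIs and query templates used by SPVB may differ.

```python
# Hypothetical IRI prefix; the real benchmark may name its graphs differently.
GRAPH_BASE = "http://example.org/spvb/version/"


def graph_iri(version):
    """Return the (assumed) named graph IRI that holds one version."""
    return f"{GRAPH_BASE}{version}"


def materialize_query(version):
    """Build a query that retrieves every triple of a single version."""
    return (
        "SELECT ?s ?p ?o WHERE { "
        f"GRAPH <{graph_iri(version)}> {{ ?s ?p ?o }} "
        "}"
    )


def added_triples_query(old_version, new_version):
    """Build a query for triples present in new_version but not in old_version."""
    return (
        "SELECT ?s ?p ?o WHERE { "
        f"GRAPH <{graph_iri(new_version)}> {{ ?s ?p ?o }} "
        f"FILTER NOT EXISTS {{ GRAPH <{graph_iri(old_version)}> {{ ?s ?p ?o }} }} "
        "}"
    )
```

Running such queries against the store that holds the reference data is one way the expected results could be obtained; a system that uses a different archiving strategy would have to answer the same questions through its own interface.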

In order to evaluate the success of systems we define the following KPIs:

  • Query failures: The number of queries that failed to be executed. Failure refers to the fact that the benchmarked system does not return the expected results.
  • Throughput (in queries per second): The execution rate per second for all queries.
  • Initial version ingestion speed (in triples per second): The total triples that can be loaded per second for the dataset’s initial version.
  • Applied changes speed (in triples per second): The average number of changes that can be stored by the benchmarked system per second after the loading of all new versions.
  • Storage space (in KB): This KPI measures the total storage space required to store all versions.
  • Average query execution time (in ms): The average execution time, in milliseconds, for each one of the eight versioning query types (e.g. version materialization, single-version queries, cross-version queries, etc.).
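
The query-related KPIs above could be computed along the following lines. This is a simplified sketch under stated assumptions rather than the Evaluation Module itself: the QueryRun record and the function names are hypothetical, queries are assumed to run sequentially, and throughput is taken as the number of queries divided by their total execution time.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class QueryRun:
    """Hypothetical record of one executed query."""
    query_type: str      # e.g. "version materialization"
    elapsed_ms: float    # measured execution time
    expected: frozenset  # expected result rows
    actual: frozenset    # rows returned by the benchmarked system


def compute_query_kpis(runs):
    """Compute query failures, throughput and per-type average time."""
    failures = sum(1 for r in runs if r.actual != r.expected)

    total_ms = sum(r.elapsed_ms for r in runs)
    throughput = len(runs) / (total_ms / 1000) if total_ms > 0 else 0.0

    times_by_type = defaultdict(list)
    for r in runs:
        times_by_type[r.query_type].append(r.elapsed_ms)
    avg_ms = {t: sum(v) / len(v) for t, v in times_by_type.items()}

    return {
        "query_failures": failures,
        "throughput_qps": throughput,
        "avg_exec_time_ms": avg_ms,
    }


def triples_per_second(num_triples, elapsed_seconds):
    """Ingestion-style KPI: triples handled per second."""
    return num_triples / elapsed_seconds if elapsed_seconds > 0 else 0.0
```

Comparing result sets as unordered frozensets is a design choice of this sketch; queries whose results are ordered would need a stricter comparison.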

[3] Fouad Zablith, Grigoris Antoniou, Mathieu d’Aquin, Giorgos Flouris, Haridimos Kondylakis, Enrico Motta, Dimitris Plexousakis, and Marta Sabou. Ontology evolution: a process-centric survey. KER, 30(1), 2015.
[4] Jürgen Umbrich, Michael Hausenblas, Aidan Hogan, Axel Polleres, and Stefan Decker. Towards Dataset Dynamics: Change Frequency of Linked Open Data Sources. In LDOW, 2010.
[5] Tobias Käfer, Ahmed Abdelrahman, Jürgen Umbrich, Patrick O’Byrne, and Aidan Hogan. Observing Linked Data Dynamics. In ESWC, 2013.
[6] Vassilis Papakonstantinou, Giorgos Flouris, Irini Fundulaki, Kostas Stefanidis, and Giannis Roussakis. Versioning for Linked Data: Archiving Systems and Benchmarks. In BLINK, 2016.
[7] Mohamed Morsey et al. DBpedia SPARQL Benchmark: Performance Assessment with Real Queries on Real Data. In ISWC, 2011.



2 comments on “Benchmarking Versioning Systems with SPVB v2.0”

  1. james anderson

    as you suggest there are just academic systems which manage versioned rdf, i enclose a reference which you may have overlooked:

    if you set up a repository on, it is revisioned. at the moment, there are a bit over two thousand revisioned sparql endpoints in service there. the large majority have few revisions, but several have over a hundred and one has over 7K.

    • Vassilis Papakonstantinou

      Hi James, we actually say that there are mostly academic systems.

      Of course, we are aware of Dydra and its powerful functionalities.

      However, its remote cloud-based instance cannot be benchmarked on the HOBBIT platform. Our benchmarking platform requires the benchmarked systems to be uploaded to it.

      It would be great if a docker image of Dydra could be uploaded to the HOBBIT platform, so it could be benchmarked on it.
