After Virtuoso took part in the MOCHA 2017 and MOCHA 2018 challenges, and won both, we at OpenLink wanted to benchmark and evaluate a set of different Virtuoso versions using the same benchmark setup, in order to get a comparative analysis and showcase the progress of its development. The benchmark we set out to use for this was the Data Storage Benchmark (DSB), developed as part of the HOBBIT project.
The Data Storage Benchmark aims to measure how data storage solutions perform with interactive read SPARQL queries, accompanied by a high insert data rate via SPARQL UPDATE queries. This approach mimics realistic application scenarios where read and write operations are bundled together. It also tests systems for their bulk load capabilities.
We benchmarked three Virtuoso commercial versions:
- Virtuoso v7.20.3221 (Aug 2017)
- Virtuoso v7.20.3227 (Apr 2018)
- Virtuoso v8.0
The evaluation consisted of two DSB benchmark setups, which we will refer to as Test 1 and Test 2, respectively. Their details are presented in the table below.
|                        | Test 1                 | Test 2                 |
|------------------------|------------------------|------------------------|
| Benchmark              | Data Storage Benchmark | Data Storage Benchmark |
| Query Execution        | Sequential             | Parallel               |
| Number of Operations   | 30,000                 | 30,000                 |
| Time Compression Ratio | 1                      | 0.5                    |
The main difference between the two tests is the manner of query execution. The first test uses sequential issuing and execution of the SPARQL SELECT and SPARQL INSERT queries which comprise the benchmark query mix, which means that a new query is issued by the benchmark only after the previous query has finished. This approach allows us to test a data storage system for its capability to handle queries when it can use all available resources to respond to a single query at a time.
On the other hand, the second test uses parallel query issuing and execution, where the queries of the benchmark query mix are issued by the benchmark according to a synthetic operation order which mimics actual user behaviour on an online social network. Here, the operations/queries overlap, which allows us to test a data storage system for its capability to handle a real-world usage load.
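The contrast between the two issuing modes can be sketched in a few lines of Python. This is only an illustration, not the actual benchmark harness: `execute_query` is a hypothetical stand-in for an HTTP request to the SPARQL endpoint of the system under test.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def execute_query(query):
    """Hypothetical stand-in for sending one SPARQL query to the store."""
    time.sleep(0.01)  # simulate server-side execution time
    return f"result of {query}"

query_mix = [f"QUERY-{i}" for i in range(5)]

def run_sequential(queries):
    """Test 1 style: each query is issued only after the previous one
    has returned, so the store serves a single query at a time."""
    return [execute_query(q) for q in queries]

def run_parallel(queries, workers=4):
    """Test 2 style: queries overlap in flight, approximating many
    concurrent users of an online social network."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(execute_query, queries))

sequential_results = run_sequential(query_mix)
parallel_results = run_parallel(query_mix)
```

Both modes produce the same results; what differs is the load profile the store experiences while answering them.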
The scale factor parameter defines the size of the dataset, and its value of 30 means that we use an RDF dataset of 1.4 billion triples. After bulk loading it, the benchmark issues 30,000 SPARQL queries (INSERT and SELECT) which are executed against the system under test. The TCR value determines whether the benchmark issues the queries in the same time scale as they are timestamped in the dataset, or speeds them up or slows them down. For the sequential issuing of queries, this parameter is irrelevant. The TCR = 0.5 parameter for the parallel task means that the time simulated by the benchmark is slowed down by 50%. This implies that the benchmark issues around 17 queries per second.
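The effect of the TCR on the issue schedule can be illustrated with a small helper. This is a sketch under one assumption, not the benchmark's actual implementation: we assume a TCR below 1 stretches the gaps between consecutive operations (so TCR = 0.5 doubles them), matching the description above of simulated time being slowed down by 50%.

```python
def issue_schedule(timestamps, tcr):
    """Map dataset timestamps (in seconds) to wall-clock issue offsets.

    Assumption for illustration: the gap between consecutive operations
    is divided by the TCR, so TCR = 0.5 doubles each gap (simulated time
    slowed down by 50%), while TCR = 1 replays them in real time.
    """
    start = timestamps[0]
    return [(t - start) / tcr for t in timestamps]

# Three operations timestamped 10 s, 11 s and 13 s in the dataset:
print(issue_schedule([10, 11, 13], tcr=0.5))  # -> [0.0, 2.0, 6.0]
print(issue_schedule([10, 11, 13], tcr=1.0))  # -> [0.0, 1.0, 3.0]
```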
For each test and each system, we executed the experiment twice, in order to eliminate the influence of chance, possible concurrent server load and the like, and to obtain numbers we can rely on.
Test 1: Sequential Query Execution
The main DSB KPIs are the average query execution time and the loading time. The following chart makes the progress across Virtuoso versions evident in both aspects: loading and querying performance. In average query execution time, Virtuoso v8.0 (red bars) is 14% faster than the newer v7.2 (green bars), and almost 50% faster than the older v7.2 (blue bars), while the differences in loading times are smaller (around 5%).
The main reasons for this dominance can be seen in the following charts, where we compare the Virtuoso versions by average query execution time per query type. There are three charts, representing complex SELECT queries (Q*), short lookups (S*), and INSERT queries (U*). For all short lookups and updates, and for the majority of complex queries, v8.0 is faster than the other two versions.
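The per-type averages shown in such charts boil down to grouping per-query timings by their Q/S/U prefix. A minimal sketch, with entirely hypothetical timing numbers used only to demonstrate the grouping:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-query timings in milliseconds; the names follow the
# DSB convention of Q* for complex SELECTs, S* for short lookups and
# U* for INSERT queries.
timings = {
    "Q1": [120.0, 130.0],
    "Q2": [400.0, 420.0],
    "S1": [5.0, 7.0],
    "U1": [15.0, 17.0],
}

def average_per_type(timings):
    """Average all measured runs, grouped by the query-type prefix."""
    by_type = defaultdict(list)
    for name, runs in timings.items():
        by_type[name[0]].extend(runs)  # group by the Q/S/U prefix
    return {prefix: mean(runs) for prefix, runs in by_type.items()}

print(average_per_type(timings))  # -> {'Q': 267.5, 'S': 6.0, 'U': 16.0}
```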
Test 2: Parallel Query Execution
In this set of experiments, the differences between versions are smaller, but the newer versions are still clearly faster. All query executions are slower than in Test 1, due to the high throughput and many queries being submitted at the same time. Here, we present the same charts, with the same meaning and colors.
The evaluation of a set of different Virtuoso versions using the Data Storage Benchmark showed significant improvement of our system across versions. The latest version, Virtuoso v8.0, exhibits greatly improved query execution times compared to Virtuoso v7.2. In the future, we plan to run similar experiments using the other HOBBIT benchmarks, in order to showcase other areas of improvement and detect possible optimization goals for future development.
Authors: Mirko Spasić and Milos Jovanovik (OpenLink Software)
Milos Jovanovik and Mirko Spasić. "First Version of the Data Storage Benchmark", May 2017. Project HOBBIT Deliverable 5.1.1.
Mirko Spasić and Milos Jovanovik. "Second Version of the Data Storage Benchmark", May 2018. Project HOBBIT Deliverable 5.1.2.