HOBBIT: Holistic Benchmarking of Big Linked Data (https://project-hobbit.eu)

HOBBIT @ ISWC 2018 (Tue, 20 Nov 2018)

We are very pleased to announce that HOBBIT members presented 7 papers and organized a workshop at ISWC 2018, held on 8-12 October 2018 in Monterey, USA. The International Semantic Web Conference (ISWC) is the premier international forum where Semantic Web / Linked Data researchers, practitioners, and industry specialists come together to discuss, advance, and shape the future of semantic technologies on the web, within enterprises and in the context of public institutions.

Muhammad Saleem (InfAI) presented LargeRDFBench, a billion-triple benchmark for SPARQL query federation which encompasses real data as well as real queries pertaining to real biomedical use cases. State-of-the-art SPARQL endpoint federation approaches are evaluated with this benchmark with respect to different performance measures. He also presented a demo of SQCFrameWork, a SPARQL query containment benchmark generation framework. SQCFrameWork is able to generate customized SPARQL containment benchmarks from real SPARQL query logs, and is flexible enough to generate benchmarks of varying sizes and complexities according to user-defined criteria on the SPARQL features that are important for query containment benchmarking. Finally, he presented an extension of LargeRDFBench which pinpoints some of its limitations and proposes 8 additional queries. The evaluation of the state-of-the-art federation engines on these additional queries revealed interesting insights. This paper was presented at the workshop on Scalable Semantic Web Knowledge Base Systems (SSWS) and was also selected as one of the workshop's best papers, to be published in the ISWC satellite events proceedings.

Tzanina Saveta (FORTH) presented the Ontology Alignment Evaluation Initiative (OAEI) 2017.5 pre-campaign at the OAEI 2018 workshop. Like in 2012, when the OAEI transitioned its evaluation to the SEALS platform, a pre-campaign was organized to assess the feasibility of moving from SEALS to the HOBBIT platform. Tzanina presented the feedback from the developers who integrated their systems and benchmarks into the HOBBIT platform, and the future steps for using the HOBBIT platform in the OAEI challenge(s) were also discussed. Tzanina also presented the Spatial Benchmark Generator SPgen at the ISWC Research Track. SPgen can be used to test the performance of link discovery systems that deal with topological relations as defined by the state-of-the-art DE-9IM (Dimensionally Extended nine-Intersection Model), and implements all topological relations of DE-9IM between LineStrings and Polygons in two-dimensional space.

Mohnish Dubey (IAIS) presented EARL, a framework that performs entity linking and relation linking as a joint task, at the ISWC Research Track. The framework relies on three base features and re-ranking steps in order to predict entities and relations.

Irini Fundulaki (FORTH) presented in detail the Semantic Publishing Versioning Benchmark (SPVB) at the 12th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS 2018). SPVB aims to test the ability of versioning systems to efficiently manage versioned Linked Data datasets and queries evaluated on top of these datasets. The paper also appeared in "Emerging Topics in Semantic Technologies. ISWC 2018 Satellite Events" with AKA Verlag, since it had been selected as one of the best papers of SSWS 2018.

InfAI co-organized the 4th Natural Language Interfaces for the Web of Data (NLIWOD) workshop and the 9th Question Answering over Linked Data challenge (QALD-9). The event brought together two active communities in the area of interaction paradigms for Linked Data: NLIWOD, a workshop for discussing the advancement of natural language interfaces to the Web of Data, and QALD, a benchmarking campaign powered by HOBBIT (project-hobbit.eu).

Finally, Irini Fundulaki (FORTH) gave a keynote talk on the use of benchmarks for health-care solutions at the First International Workshop on Semantic Web Technologies for Health Data Management, where she talked about benchmarking principles and methodology.

 

HOBBIT Highlights: Summary of HOBBIT challenges (1st & 2nd period) (Mon, 05 Nov 2018)

The HOBBIT project has successfully organized two series of challenges in order to measure the performance of implemented systems in processing Big Linked Data. In total, we organized ten (10) challenges, five (5) in each period. In particular, during the first period, HOBBIT organized:

For the second period, we were pleased to re-organize the challenges that we had run at ESWC 2017, ISWC 2017 and DEBS 2017, with one exception: instead of the QALD challenge, we organized the 1st edition of the SQA challenge as an "offspring" of QALD. Specifically, during the second period, HOBBIT organized:

Challenges consisted of multiple benchmarking tasks prepared by the HOBBIT partners; participants were invited to submit systems that tackled one or more of a challenge's tasks. Participation in the first period was satisfying, with a total of 22 systems. The successful organization of the first-period challenges rewarded the HOBBIT project with an increased number of participants in the second period: 31 systems in total took part in the second round. Overall, the second series of challenges attracted 59 systems (i.e. 30 participating systems and 29 benchmarks tested internally by the challenges' teams).

The HOBBIT platform successfully supported all challenges during these two periods. Participating systems were tested using benchmarks developed by HOBBIT, and their evaluation was conducted on top of the HOBBIT platform. The challenges' results were presented to the public in dedicated workshop sessions that attracted several attendees. Importantly, the qualitative evaluation of the challenges via questionnaires, carried out separately for each period, provided the HOBBIT consortium with valuable feedback for improving its documentation and enhancing the platform. In this way, not only did we create a faithful, growing audience for the HOBBIT challenges, but the HOBBIT platform also became established as a reliable benchmarking solution for all steps of the Linked Data lifecycle.

 

OpenLink & SAGE Project: Creating a GeoSPARQL Benchmark (Mon, 15 Oct 2018)

The success of the HOBBIT benchmarking platform, along with the fact that many RDF / SPARQL systems are readily available for benchmarking on the platform, made us look at the options of using it as part of another R&D project in which OpenLink is involved, called SAGE.

The project “SAGE: Semantic Geospatial Analytics” aims to tackle the management of big geospatial data, by developing dedicated algorithms, time-efficient storage and querying strategies, extending the GeoSPARQL standard with continuous queries, and developing scalable link discovery approaches for streaming geospatial data. All these efforts will lower the main barriers to harnessing the full power of geospatial data in companies.

One of the initial tasks in SAGE, therefore, is to ensure GeoSPARQL compliance of OpenLink's Virtuoso Universal Server. In order to automate the compliance test and make it a general test for all SPARQL processing systems, we opted to develop a GeoSPARQL benchmark as part of the HOBBIT benchmarking platform.

GeoSPARQL Benchmark for the HOBBIT Platform

The current version of the GeoSPARQL compliance benchmark is based on the example validation queries defined in the GeoSPARQL standard (document #11-052r4). The same test dataset and test queries are used by SPARQLScore / Tester-for-Triplestores in their GeoSPARQL subtest. Even though these queries do not explicitly test all requirements defined within the standard, they do cover a broad range of requirement types and thus provide a quick and simple compliance test for the standard.
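
To give a feel for what such a compliance check involves, here is a minimal sketch (not the benchmark's own code) that sends a GeoSPARQL-style validation query to a SPARQL endpoint using Python and SPARQLWrapper; the endpoint URL and the data the query assumes are hypothetical.

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Hypothetical endpoint; replace with the system under test.
    endpoint = SPARQLWrapper("http://localhost:8890/sparql")

    # A validation-style query in the spirit of the GeoSPARQL examples:
    # find features whose geometries spatially intersect a given WKT polygon.
    endpoint.setQuery("""
    PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
    PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

    SELECT ?feature WHERE {
      ?feature geo:hasGeometry ?g .
      ?g geo:asWKT ?wkt .
      FILTER (geof:sfIntersects(?wkt,
        "POLYGON((13.3 52.4, 13.5 52.4, 13.5 52.6, 13.3 52.6, 13.3 52.4))"^^geo:wktLiteral))
    }
    """)
    endpoint.setReturnFormat(JSON)

    # A compliant store should parse and execute the query without error.
    results = endpoint.query().convert()
    print(len(results["results"]["bindings"]), "matching features")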

Results from running the current version of the GeoSPARQL benchmark against Virtuoso v8.2 Enterprise Edition at the HOBBIT platform

The initial runs of the benchmark showed that Virtuoso v8.2 Enterprise Edition was able to correctly answer most of the benchmark questions, scoring 4/6 (see the figure above). We obtained the same results with the latest Virtuoso v7.2 Open Source Edition (with GeoSPARQL support enabled) during internal testing.

Since these initial tests, we at OpenLink have been working on extending the GeoSPARQL support to all aspects of the standard, which will first improve the score on the benchmark and subsequently provide better query performance for geo-enabled queries. In parallel, as part of the R&D efforts in SAGE, our team will be working on extending the GeoSPARQL benchmark to include and thoroughly test all facets of the standard.

HOBBIT Spatial Benchmark V2.0 (Wed, 10 Oct 2018)

A number of real and synthetic benchmarks have been proposed for evaluating the performance of link discovery systems. So far, only a limited number of link discovery benchmarks target the problem of linking geospatial entities. However, some of the largest knowledge bases of the Linked Open Data Cloud, such as LinkedGeoData, contain vast amounts of spatial information. Furthermore, several systems have been developed that manage spatial data and consider the topology of spatial resources and the topological relations between them. In order to assess the ability of these systems to handle vast amounts of spatial data and perform the much-needed data integration in the Linked Geo Data Cloud, it is imperative to develop benchmarks for geospatial link discovery. Thus, in the context of the HOBBIT project we have developed the Spatial Benchmark Generator (SPgen). SPgen can be used to test the performance of systems that deal with the topological relations of the state-of-the-art DE-9IM (Dimensionally Extended nine-Intersection Model) [1].

The supported topological relations of the model are:

  • Equals
  • Disjoint
  • Touches
  • Contains/Within
  • Covers/CoveredBy
  • Intersects
  • Crosses
  • Overlaps

In the first version of the Spatial Benchmark Generator we implemented all topological relations of DE-9IM between LineStrings, and in the second version we have enriched it with the relations between LineStrings and Polygons in two-dimensional space. SPgen follows the choke-point-based approach [2] for benchmark design, i.e., it focuses on the technical difficulties of existing systems and implements tests that address those difficulties, "pushing" systems to resolve them in order to become better. More specifically, SPgen focuses on the following choke points:

  • Scalability: produce datasets large enough to stress the systems under test
  • Output quality: compute precision, recall and F-measure
  • Time performance: measure the time the systems need to return the results

SPgen takes as input traces represented as LineStrings and produces a source and a target dataset that implement a specific DE-9IM relation between source and target instances.

In this version of SPgen we have three data generators: TomTom, Spaten and DEBS.

  1. TomTom provides a Synthetic Trace Generator, developed in the context of the HOBBIT project, which facilitates the creation of an arbitrary volume of data from statistical descriptions of vehicle traffic. More specifically, it generates traces, a trace being a list of (longitude, latitude) pairs recorded by one device (phone, car, etc.) throughout one day. TomTom was the only data generator in the first version of SPgen.
  2. Spaten is an open-source, configurable spatio-temporal and textual dataset generator that can produce large volumes of data based on realistic user behavior. Spaten extracts GPS traces from realistic routes using the Google Maps API, and combines them with real POIs and relevant user comments crawled from TripAdvisor. Spaten publicly offers GB-size datasets with millions of check-ins and GPS traces.
  3. DEBS provides a selection of AIS data collected from the MarineTraffic coastal network. It has been used for the EU H2020 research project BigDataOcean and the ACM DEBS Grand Challenge 2018.

A user of SPgen can choose which dataset to use as the source dataset. The source dataset is identical to the input traces but is expressed in the Well-Known Text (WKT) format, whereas the target dataset consists of LineStrings or Polygons generated from the source dataset in such a way that traces in the target dataset have a specific DE-9IM topological relation with the traces of the source dataset.
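
To make the idea concrete, here is a minimal sketch using the shapely library (an assumption for illustration, not necessarily what SPgen itself uses) that builds a target LineString with one chosen DE-9IM relation, "touches", to a source trace given in WKT, and then verifies the relation:

    from shapely import wkt
    from shapely.geometry import LineString

    # A source trace in WKT, as in the SPgen source dataset (made-up coordinates).
    source = wkt.loads("LINESTRING (13.40 52.52, 13.41 52.53, 13.42 52.53)")

    # Build a target that only TOUCHES the source: it starts at the source's
    # last point and heads elsewhere, so the boundaries meet but the
    # interiors do not intersect.
    x, y = source.coords[-1]
    target = LineString([(x, y), (x + 0.01, y + 0.01)])

    print(target.touches(source))   # True
    print(source.relate(target))    # the 9-character DE-9IM intersection matrix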

To compute the gold standard, we resorted to an appropriate existing system, namely RADON [3]. RADON was selected because it is a novel approach for the rapid discovery of topological relations among geospatial resources. RADON was evaluated with real datasets of various sizes and showed that, in addition to being complete and correct, it also outperforms state-of-the-art spatial link discovery systems by up to three orders of magnitude.

SPgen has been integrated into the HOBBIT platform and can be used for benchmarking any system that  is able to identify topological relations.

 

[1] C. Strobl. Encyclopedia of GIS, chapter Dimensionally Extended Nine-Intersection Model (DE-9IM), pages 240–245. Springer, 2008.
[2] P. Boncz, T. Neumann, and O. Erling. TPC-H analyzed: Hidden messages and lessons learned from an influential benchmark. In TPC-TC, pages 61–76. Springer, 2013.
[3] M.-A. Sherif, K. Dreßler, P. Smeros, and A.-C. Ngonga Ngomo. RADON – Rapid Discovery of Topological Relations. In AAAI, 2017.

SQA Challenge (Mon, 17 Sep 2018)

Question answering (QA) systems have recently become commercially viable (and lucrative) products, thanks to increasing investments and research and the development of intuitive and easy-to-use interfaces. Regular interactions with QA systems have become increasingly frequent and natural, and consumers' expectations of their capabilities keep growing. Such systems are now available in various settings, devices and languages, and their exploding usage in real (non-experimental) settings has boosted the demand for resilient systems that can cope with high-volume demand.

The Scalable Question Answering (SQA) challenge was successfully executed on the HOBBIT platform at the ESWC conference this year, with the aim of providing an up-to-date benchmark for assessing and comparing state-of-the-art systems that mediate between a large volume of users, expressing their information needs in natural language, and RDF data. In particular, successful approaches to this challenge had to scale up to big data volumes, handle a vast number of questions and accelerate the question answering process, so that the highest possible number of questions could be answered as accurately as possible in the shortest time.

The dataset was derived from the award-nominated LC-QuAD dataset, which comprises 5000 questions of variable complexity and their corresponding SPARQL queries over DBpedia. In contrast to the analogous challenge task run at ESWC in 2017, the adoption of this new dataset ensured an increase in the complexity of the questions and the introduction of “noise” in the form of spelling mistakes and anomalies as a way to simulate a noisy real-world scenario in which questions may be served to the system imperfectly as a result, for instance, of speech recognition failures or typing errors.

The benchmark (see our previous post for more details) sends one question to the QA system at the start, two more questions after one minute, and continues to send n+1 new questions after n minutes. One minute after the last set of questions is dispatched, the benchmark closes and the evaluation is generated. Along with the usual measures of precision, recall and F-measure, an additional measure was introduced as the main ranking criterion: the Response Power. It is defined as the harmonic mean of three values: precision, recall, and the ratio between processed questions (an empty answer is considered processed, a missing answer unprocessed) and the total number of questions sent to the system.
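
Under that definition, the Response Power can be computed as in the sketch below; the figures in the example are made up purely for illustration.

    def response_power(precision, recall, processed, total):
        # Harmonic mean of precision, recall and the ratio of processed
        # questions (empty answers count as processed, missing ones do not).
        ratio = processed / total
        values = (precision, recall, ratio)
        if any(v == 0 for v in values):
            return 0.0
        return 3.0 / sum(1.0 / v for v in values)

    # Hypothetical figures, for illustration only:
    print(round(response_power(0.55, 0.48, processed=180, total=210), 3))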

Out of the over twenty teams who expressed an interest in the challenge, three (from Canada, Finland and France) were able to submit their systems and present them at the conference (which was a requirement). The final winner was the WDAqua-core1 system with a Response Power of 0.472, followed by GQA (0.028) and LAMA (0.019).

We would like to thank Meltwater for sponsoring a social dinner for the Question Answering community at ESWC 2018, as well as for providing the 300€ prize for the SQA challenge winner, and Springer for offering a 100€ voucher for the runner-up.

 

 

HOBBIT @ DEBS Conference 2018 (Mon, 13 Aug 2018)

The DEBS Grand Challenge 2018 was successfully executed on the HOBBIT online platform. The Grand Challenge series, a joint event with the annual Distributed Event-Based Systems (DEBS) conference, focuses on analytical and machine learning algorithms running in a distributed streaming manner. During the competition, participating teams have to demonstrate their ability to implement their solutions as distributed stream processing systems, which distinguishes the scope of the Grand Challenge from, e.g., Kaggle, which is technically batch processing. AGT International co-organized the Grand Challenge for the second time and developed a benchmark as well as a number of tools to assist participants with the integration of their solutions into the HOBBIT platform, which helped to increase the total number of submitted systems.

The DEBS Grand Challenge 2018 is oriented towards the machine learning community and focuses on the maritime surveillance domain. The anonymized dataset was provided by the MarineTraffic portal and contained about 1 million tracks (geo coordinates, course, speed and other relevant information) received from automatic identification systems (AIS) installed on real ships. Within the challenge, analytical systems were expected to sequentially receive AIS tracks and to produce two predictions for each track: (1) the arrival port name and (2) the expected arrival time. Given these predictions, systems were evaluated and ranked as follows: (1) the length (as a percentage of the whole trip) of the last sequence of correctly predicted port names; (2) the mean absolute error of the arrival time prediction (against the real arrival time). The overall ranking of the systems was calculated from these ranks (75% impact) as well as the systems' running time (25% impact).
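
For instance, the first ranking criterion (how early the port name is predicted correctly and then kept correct) can be sketched as follows; the function and the example values are illustrative only and not the official evaluator.

    def earliness(predicted_ports, true_port):
        # Fraction of the trip covered by the final run of consecutive
        # correct port predictions (criterion 1 described above).
        correct_tail = 0
        for prediction in reversed(predicted_ports):
            if prediction != true_port:
                break
            correct_tail += 1
        return correct_tail / len(predicted_ports)

    # Hypothetical trip of 10 AIS tracks: the port is predicted correctly
    # from the 4th track onwards, so the earliness is 0.7 (70%).
    print(earliness(["PIRAEUS"] * 3 + ["VALENCIA"] * 7, "VALENCIA"))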

From an organizational point of view, the online challenge was split into a training phase and a final phase. The training phase allowed the participants to continuously test and improve their systems using the benchmark accompanied by a small subset of the test data. Participants registered their systems for periodical training challenge executions and the results were available on online leaderboards (two leaderboards, one per challenge task). The training phase allowed the teams to thoroughly test their systems under realistic conditions and to solve all technical problems before the final phase of the challenge. The final phase included one execution of each task and was held in a non-public mode: the leaderboards became publicly available online after the awarding procedure at the DEBS conference.

Technically, the challenge was supported on GitHub, where two sample systems (in Java and Python) have been published. Both systems are built around the HOBBIT Java SDK and allow participants to debug their systems before building a Docker container and uploading it to the online platform. The sample systems as well as the Java SDK were created by AGT based on the experience gained during the Grand Challenge 2017, the first DEBS challenge on the HOBBIT platform.

In total, more than 15 teams expressed an interest in the challenge, and 9 systems from 8 teams were submitted for the final challenge. The winning solution was created by the team from Alexandru Ioan Cuza University of Iaşi, Romania (the team also won the DEBS Grand Challenge 2017), which demonstrated the best results among the submitted systems (an earliness rate of 70% and a mean absolute error of about 800 minutes). Teams from Chungnam National University (Korea), Technische Universität Dresden (Germany), Technion (Israel), NUI Galway (Ireland), University of Illinois at Urbana-Champaign (USA), Centre National de la Recherche Scientifique (France) and the University of Carthage (Tunisia) presented their solutions at the DEBS conference during the oral and poster sessions.

The DEBS steering committee as well as the teams that participated in the challenge provided positive feedback on the HOBBIT platform and expressed their interest in continuing to use it for further Grand Challenges. The upcoming DEBS Conference 2019 will take place in Darmstadt, Germany.

Virtuoso development progress evaluated with the Data Storage benchmark (Wed, 04 Jul 2018)

After Virtuoso took part in the MOCHA 2017 and MOCHA 2018 challenges, and won both, we at OpenLink wanted to benchmark and evaluate a set of different Virtuoso versions using the same benchmark setup, in order to get a comparative analysis and showcase the progress of its development. The benchmark we set out to use for this was the Data Storage Benchmark (DSB) [1][2], developed as part of the HOBBIT project.

The Data Storage Benchmark aims to measure how data storage solutions perform with interactive read SPARQL queries, accompanied by a high insert data rate via SPARQL UPDATE queries. This approach mimics realistic application scenarios where read and write operations are bundled together. It also tests systems for their bulk load capabilities.

Test Setup

We benchmarked three Virtuoso commercial versions:

  1. Virtuoso v7.20.3221 (Aug 2017)
  2. Virtuoso v7.20.3227 (Apr 2018)
  3. Virtuoso v8.0

The evaluation consisted of two DSB benchmark setups, which we will refer to as Test 1 and Test 2, respectively. Their details are presented in the table below.

Parameter                 Test 1                    Test 2
Benchmark                 Data Storage Benchmark    Data Storage Benchmark
Query Execution           Sequential                Parallel
Number of Operations      30,000                    30,000
Scale Factor              30                        30
Seed                      100                       100
Time Compression Ratio    1                         0.5

The main difference between the two tests is the manner of query execution. The first test uses sequential issuing and execution of the SPARQL queries which comprise the benchmark query mix (SPARQL SELECT and SPARQL INSERT queries): a new query is issued by the benchmark only after the previous query has finished. This approach allows us to test a data storage system's capability to handle queries when it can devote all available resources to a single query at a time.

On the other hand, the second test uses parallel query issuing and execution, where the queries of the benchmark query mix are issued according to a synthetic operation order which mimics actual user behaviour on an online social network. Here the operations / queries overlap, which allows us to test a data storage system's capability to handle a real-world usage load.
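
As a rough illustration of the difference (not the benchmark's actual driver code), the sketch below issues queries one after the other versus concurrently; run_query is a hypothetical stand-in for whatever client call executes a single SPARQL operation against the store.

    from concurrent.futures import ThreadPoolExecutor

    def run_sequential(queries, run_query):
        # A new query is issued only after the previous one has returned,
        # so the store can devote all its resources to one query at a time.
        return [run_query(q) for q in queries]

    def run_parallel(queries, run_query, workers=16):
        # Queries are issued concurrently, so operations overlap and the
        # store experiences something closer to real multi-user load.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return list(pool.map(run_query, queries))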

The scale factor parameter defines the size of the dataset; a value of 30 means that we use an RDF dataset of 1.4 billion triples. After bulk loading it, the benchmark issues 30,000 SPARQL queries (INSERT and SELECT) which are executed against the system under test. The Time Compression Ratio (TCR) determines whether the benchmark issues the queries on the same time scale as they are timestamped in the dataset, or speeds them up / slows them down. For the sequential issuing of queries this parameter is irrelevant. The TCR value of 0.5 for the parallel task means that the time simulated by the benchmark is slowed down by 50%, which implies that the benchmark issues around 17 queries per second.

For each test and each system, we executed the experiment twice, in order to rule out coincidence and the influence of momentary server load, and to obtain numbers we can rely on.

The benchmarking took place on our private instance of the HOBBIT Platform. The experiments were executed on a dual Xeon(R) E5-2630@2.33GHz machine with 192GB RAM.

Test 1: Sequential Query Execution

The main DSB KPIs are the average query execution time and the loading time. From the following chart, the progress achieved across Virtuoso versions is evident in both aspects: loading and querying performance. In terms of average query execution time, Virtuoso v8.0 (red bars) is faster by 14% than the newer v7.2 (green bars), and by almost 50% than the older v7.2 (blue bars), while the differences in loading times are smaller (5%).

The main reasons for this dominance can be seen in the following charts, where we compare the different Virtuoso versions by average query execution time per query type. There are three charts, representing complex SELECT queries (Q*), short lookups (S*) and INSERT queries (U*). For all short lookups and updates, and for the majority of complex queries, v8.0 is faster than the other two versions.

Test 2: Parallel Query Execution

In this set of experiments, the differences between versions are not as large, but it is still clear that the newer versions are significantly faster. All query executions are slower in this test compared to Test 1, due to the high throughput and the many queries submitted at the same time. Here, we present the same charts as before, with the same meaning and colors.

Conclusions

The evaluation of a set of different Virtuoso versions using the Data Storage Benchmark showed significant improvement of our system across versions. The latest version, Virtuoso v8.0, exhibits largely improved query execution times compared to Virtuoso v7.2. In the future we plan to run similar experiments using the other HOBBIT benchmarks, in order to showcase other areas of improvement and detect possible optimization goals for future development.

Authors: Mirko Spasić and Milos Jovanovik (OpenLink Software)

 

[1] Milos Jovanovik and Mirko Spasić. “First Version of the Data Storage Benchmark”, May 2017. Project HOBBIT Deliverable 5.1.1.
[2] Mirko Spasić and Milos Jovanovik. “Second Version of the Data Storage Benchmark”, May 2018. Project HOBBIT Deliverable 5.1.2.

 

Katana & HOBBIT (Thu, 28 Jun 2018)

USU Software AG is a mid-tier business competing in many areas of IT. Based on the insights of several research projects, we are currently improving our big data analytics and machine learning expertise. In the past, we were able to provide solutions for industrial big data analysis, applying machine learning and complex event processing technologies to sensor data (historical data and streams) from production and machine tools [1]. For example, we carried out Germany's largest project for predictive maintenance at our customer Heidelberger Druckmaschinen AG, predicting the status of more than 10,000 printing machines and analyzing more than 1,500 sensors per machine [2].

In a previous blog post we described how we used several HOBBIT benchmarks to measure and improve the performance of our machine learning applications. In contrast, this blog post focuses on the influence of HOBBIT on our new big data analytics platform, Katana.

Katana provides a holistic platform for our customers to store and process their data, develop custom machine learning pipelines, visualize results and finally transfer the developed algorithms into their production environment. This product is one of the first in an evolving market. To provide the best solution for our customers and stay competitive, or even lead the market, we need to evaluate and select the best solutions for our technology stack. HOBBIT helped us to evaluate and select the core technology of Katana based on the defined benchmarks, their KPIs, and results measured against the defined gold standards. Furthermore, the development of the web-based graphical user interface of HOBBIT by USU, as well as the evaluation of frameworks for identity and access management, highly influenced the realization of our own products.

By locally installing our own instance of the HOBBIT platform we created an environment in which to independently test and evaluate core technologies. This enables us to measure the performance of storage solutions such as databases and distributed file systems using the Data Storage Benchmark, and to evaluate newly developed data processing algorithms using the Machine Learning and Streams benchmarks.

We see a high potential for the platform in academia as well as industry and expect more developers, solution providers and technology users to get involved in the association, the development of new benchmarks and the further improvement of the platform.

 

[1] https://www.usu.de/de/industrial-smart-service/
[2] https://cdn.usu.de/fileadmin/user_upload/industrial-smart-service/PM_HDM.pdf

 

ESWC 2018 Challenges (Wed, 20 Jun 2018)

The HOBBIT project organized three benchmarking challenges at the ESWC 2018 conference, which took place in Heraklion, Crete, Greece, from June 3rd to June 7th. The project aims to bring together and inspire people from academia and industry to come up with fresh and innovative ideas for solving real-life problems concerning Big Linked Data.

We are proud that, under the umbrella of HOBBIT, we reran the challenges organized at ESWC 2017: the 2nd edition of MOCHA (Mighty Storage Challenge), the 4th edition of the OKE (Open Knowledge Extraction) challenge and the SQA (Scalable Question Answering) challenge, an offspring of the well-known QALD (Question Answering over Linked Data) challenge.

Challenges consisted of multiple benchmarking tasks prepared by the HOBBIT partners, and participants were invited to submit both a system that tackles one or more of a challenge's tasks and a short, 5-page paper describing their system. Specifically, the following benchmarking tasks were offered by the three HOBBIT challenges (the tasks are described in detail on the challenges' websites):

MOCHA
  • Task 1: RDF Data Ingestion
  • Task 2: Data Storage Benchmark
  • Task 3: Versioning RDF Data
  • Task 4: Faceted Browsing Benchmark
OKE
  • Task 1: Focused Named Entity Identification and Linking
  • Task 2: Broader Named Entity Identification and Linking
  • Task 3: Relation Extraction
  • Task 4: Knowledge Extraction
SQA
  • Task 1: Large-Scale Question Answering over RDF

The papers describing the systems were peer-reviewed by experts and the most promising systems were invited to participate in each challenge. Two systems took part in the MOCHA challenge, one in the OKE challenge and three in the SQA challenge. The HOBBIT partners also provided additional baseline systems, against which those of the participants were evaluated.

The HOBBIT platform successfully supported the running of the three challenges. The datasets pertaining to the benchmarking tasks were integrated into the platform and the evaluation of the competing systems was performed using it. Importantly, the challenges provided the HOBBIT consortium with another opportunity to identify and fix bugs in the platform, improve its documentation and stress-test the platform.

A session dedicated to the three HOBBIT challenges took place on the 5th of June as part of the main ESWC 2018 conference. This ensured great exposure for HOBBIT and the competing systems, as the session attracted around 22 attendees. For each challenge an overview presentation was given by the challenge organizers, and the authors of the competing systems were given the opportunity to briefly present their work.

The icing on the cake came during the ESWC 2018 closing ceremony on the evening of the 7th of June, where the winners of the three challenges were announced and handed their winner certificates. The winners also received prize money for their achievement from the HOBBIT sponsors USU and Meltwater.

USU sponsored the MOCHA and OKE challenges with 250€ each, while Meltwater sponsored the SQA challenge with 300€ and additionally organized a social dinner on the evening of the 5th of June for the question answering community.

The winners of the three HOBBIT challenges were:

MOCHA – Milos Jovanovik and Mirko Spasić, Benchmarking Virtuoso 8 at the Mighty Storage Challenge 2018: Training Results (prize: 250€ from USU + 100€ from Springer)

OKE – Hector Cerezo-Costas and Manuela Martín-Vicente, Relation Extraction for Knowledge Base Completion: A supervised approach (prize: 250€ from USU + 100€ from Springer)

SQA (winner) – Dennis Diefenbach, Kamal Singh and Pierre Maret, On the scalability of the QA system WDAqua-core1 (prize: 300€ from Meltwater)

SQA (second place) – Elizaveta Zimina, Jyrki Nummenmaa, Kalervo Järvelin, Jaakko Peltonen, Kostas Stefanidis and Heikki Hyyrö, GQA: Grammatical Question Answering for RDF Data (prize: 100€ from Springer)

Photos from the challenges workshop session (5th of June)

Generating Transport Data (Wed, 16 May 2018)

The growing number of navigation service users provides many opportunities to improve driving conditions by making better use of the road network. However, collecting large quantities of such data can be difficult due to both costs and privacy concerns. TomTom collects large amounts of anonymized traffic data, respecting the user's right to privacy: https://www.tomtom.com/en_gb/privacy/. This allows us to generate statistical descriptions that can be used to generate a large number of synthetic trips with similar properties to real data, but without the associated issues. In this post we'll explore our approach to creating such a generator.

We’ll start by defining some terms that will be used throughout this post:

  • Point: A location (latitude and longitude), a speed and a time stamp.
  • Trip: A sequence of points from one origin to one destination with a fixed interval between consecutive points.
  • Trace: A sequence of one or more trips recorded by the same device, all with the same time interval between points.

Each trace is generated from a few parameters sampled from the following distributions:

  • Trip origin: A 2D histogram that divides the map into cells where trips start.
  • Trip destination: A 2D histogram that divides the map into cells where trips end.
  • Trace begin year, month, weekday and minute of day.
  • Number of trips per trace.
  • Minutes between consecutive trips.
  • Trip air line (or great-circle) distance.
  • Seconds between consecutive points.

We start the process of generating a trace by sampling the number of trips in the trace and the seconds between points from their distributions. The first trip's starting time and location are also sampled from the appropriate distributions, and the trip is simulated.

Figure 1. The trip generator.

The process used to generate a single trip is illustrated in Figure 1. Given a node on the map as the origin of the trip, we sample from the trip destination and air-line distance distributions to choose a destination. The trip simulator then uses the map to compute the fastest route between the two nodes and creates points along it with a fixed frequency, defined by the seconds-between-points parameter. The trip simulator is also responsible for simulating noise in the coordinates. The output is a list of points, similar to Figure 2.
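
As a side note, the air-line (great-circle) distance used when choosing a destination can be computed with the standard haversine formula; a small self-contained sketch follows, with made-up coordinates for illustration.

    import math

    def great_circle_km(lat1, lon1, lat2, lon2):
        # Haversine ("air line") distance in kilometres between two
        # (latitude, longitude) points on a sphere with the Earth's mean radius.
        earth_radius_km = 6371.0
        phi1, phi2 = math.radians(lat1), math.radians(lat2)
        dphi = math.radians(lat2 - lat1)
        dlam = math.radians(lon2 - lon1)
        a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
        return 2 * earth_radius_km * math.asin(math.sqrt(a))

    # Two made-up points in the Berlin area, a few kilometres apart.
    print(round(great_circle_km(52.52, 13.37, 52.50, 13.45), 1))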

Figure 2. A small segment of one generated trace.

The points are then appended to the end of the trace, which aggregates the points of all its trips. The process is then repeated for the remaining trips; however, both the start node and the start time are defined by the end of the previous trip. When all trips have been generated, the trace is represented as RDF (Turtle) and the next trace can be generated.
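
As an illustration of the final serialization step, the sketch below writes a single point of a trace as Turtle with rdflib; the ex: vocabulary is purely hypothetical and is not the schema the generator actually produces.

    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, XSD

    EX = Namespace("http://example.org/traces/")  # hypothetical vocabulary

    g = Graph()
    g.bind("ex", EX)

    point = EX["trace1/point0"]
    g.add((point, RDF.type, EX.Point))
    g.add((point, EX.latitude, Literal(52.5200, datatype=XSD.double)))
    g.add((point, EX.longitude, Literal(13.4050, datatype=XSD.double)))
    g.add((point, EX.speed, Literal(13.9, datatype=XSD.double)))
    g.add((point, EX.timestamp, Literal("2018-05-16T08:30:00", datatype=XSD.dateTime)))

    print(g.serialize(format="turtle"))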

Figure 3. Comparison of real (left) and synthetic (right) data for the region of Berlin, Germany.

As we can see in Figure 3, mimicking traffic with this statistical approach produces results with quite similar densities. Further analysis and explanations of the generator and its results can be found in [1].

 

[1] Bösche K., Sellam T., Pirk H., Beier R., Mieth P., Manegold S. (2013) Scalable Generation of Synthetic GPS Traces with Real-Life Data Characteristics. In: Nambiar R., Poess M. (eds) Selected Topics in Performance Evaluation and Benchmarking. TPCTC 2012. Lecture Notes in Computer Science, vol 7755. Springer, Berlin, Heidelberg.

 

 
