HOBBIT: Quo Vadis?

HOBBIT is now almost one year old. At the 2nd plenary meeting, which took place in Leipzig on September 26th and 27th, we reviewed the current state of affairs, and over the last months we have been working to turn the results of this plenary into reality. Overall, the plenary showed that we are on schedule to

  • Release the first version of HOBBIT by February 2017
  • Get challenges running and have first results by July-August 2017
  • Get the association running by December 2017

We provide more details in the following.

The Plenary

The aim of the plenary was to summarize the efforts carried out so far. We presented the work done in each work package and planned the next steps, both for the immediate future and for HOBBIT’s long-term goals. In summary, we made decisions on essential technical and practical matters, and also made some initial progress towards the creation of the HOBBIT association.

We are proud to announce that we will be organizing the first HOBBIT evaluation campaigns:

  • the DEBS Grand Challenge (Holistic Benchmarking of Big Linked Data) and
  • the Instance Matching Benchmark Challenge for Spatial Data at the OAEI workshop at ISWC

Both challenges will take place in 2017.

Three more challenges are under preparation and will be announced soon on HOBBIT’s website (https://project-hobbit.eu/challenges/). Furthermore, an alpha version of the HOBBIT Data Generators is ready. Currently, these generators can produce Twitter data, molding machine data and transportation data.

The Twitter data is synthesized so that it resembles the word and time distributions of real tweets. These distributions were computed from a collection of tweets downloaded directly from Twitter, which allows the generator to mimic the characteristics of real-life tweets. Moreover, the Twitter data is annotated using a simple ontology.

The generated transportation data follows a very simple format, similar to that used by our partner TomTom: it is a collection of GPS fixes, each consisting of a timestamp, the GPS coordinates (longitude and latitude) and the traveling speed. The molding machine data generator produces data that resembles the real-life data captured by sensors deployed on injection molding machines, like those provided by our partner Weidmüller. Each generated data instance is a timestamped 120-dimensional vector, corresponding to various parameters of the production process, and is annotated using an IoT ontology.
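To make the record shapes described above more concrete, here is a minimal Python sketch of what a GPS fix and a molding machine instance could look like. It is an illustration only and not the HOBBIT generators' actual code: the names `GpsFix`, `generate_gps_fixes` and `generate_molding_instance`, the units, the start coordinates and the random distributions are all assumptions made for this example.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class GpsFix:
    """One transportation record: timestamp, coordinates and speed."""
    timestamp: datetime
    longitude: float  # degrees (assumed unit)
    latitude: float   # degrees (assumed unit)
    speed: float      # km/h (assumed unit)


def generate_gps_fixes(n, seed=0):
    """Return n toy GPS fixes, one second apart, along a small random walk."""
    rng = random.Random(seed)
    t = datetime(2017, 1, 1, tzinfo=timezone.utc)
    lon, lat = 12.37, 51.34  # arbitrary start point (assumption)
    fixes = []
    for _ in range(n):
        fixes.append(GpsFix(t, lon, lat, rng.uniform(0.0, 120.0)))
        t += timedelta(seconds=1)
        lon += rng.uniform(-1e-4, 1e-4)
        lat += rng.uniform(-1e-4, 1e-4)
    return fixes


def generate_molding_instance(timestamp, seed=0, dims=120):
    """Return one timestamped vector of toy sensor values (120 by default)."""
    rng = random.Random(seed)
    values = [rng.gauss(0.0, 1.0) for _ in range(dims)]
    return {"timestamp": timestamp, "values": values}
```

The ontology annotations mentioned above are left out of this sketch, since their vocabulary is not described here.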

We agreed that the transition from structured to unstructured data offers added value, since it lets us generate large amounts of data and test throughput. The correlation of results with existing datasets ensures that we can predict the performance of the systems on real data.

The association

We believe that the HOBBIT association will play a key role in disseminating and extending the HOBBIT benchmarking platform (including the benchmarks, KPIs, use cases and datasets). To this end, we will initiate a new survey to gather information from industry and other potential partners on the frameworks they use to evaluate their software.

The goal of the HOBBIT association will be to provide industry-grade benchmarks for Big Linked Data (BLD). The association intends to offer various types of membership depending on companies’ specific needs. It will work hand in hand with other European benchmarking associations such as LDBC, and will provide periodic open reports comparing system performance for each of the BLD steps. The association will be financed via member fees, donations and the implementation of dedicated benchmarks for companies. These financial resources will be used to maintain the platform and organize further challenges for BLD even after the completion of the HOBBIT project.

The team
The 2nd plenary meeting team

