Benchmark II – Analysis & Processing

HOBBIT provides benchmarks for the linking and analysis of Linked Data. The linking benchmark tests the performance of instance matching methods, while the analytics benchmark tests the performance of supervised and unsupervised machine learning methods for data analytics.

  • Linking: Linking spatial resources requires techniques that differ from the classical, mostly string-based approaches. In particular, the topology of the spatial resources and the topological relations between them are of central importance to systems that manage spatial data.
    Because a large number of geo-spatial datasets are used in Linked Data and across several domains, it is critical that benchmarks for geo-spatial link discovery are developed. We propose two benchmarks that deal with link discovery for spatial data:

    1. The Linking Benchmark, based on SPIMBENCH, tests the performance of instance matching tools that implement string-based approaches for identifying matching entities.
    2. The Spatial Benchmark tests the performance of systems that deal with the topological relations defined in the state-of-the-art DE-9IM (Dimensionally Extended nine-Intersection Model).
  • Machine Learning on Structured Data: Given graph data provided by USU and AGT, we investigate supervised and unsupervised machine learning methods. These methods take resources in the graph that represent events as input and learn descriptions of abnormal events and problems. The gold standard for supervised learning methods is derived from previous abnormal events and problems, e.g. at USU (a task related to the SAKE project). Unsupervised machine learning methods are based on smart metering data provided by AGT and access measurement resources in the graph structures in order to predict energy consumption. Optimal models will be learned for individual households, using additional metadata from the graph structure.
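
To make the topological relations targeted by the Spatial Benchmark more concrete, the following minimal sketch shows how a DE-9IM intersection matrix can be checked against the standard masks of a few named relations. It is an illustration only and not part of the HOBBIT benchmark code: the names `RELATION_MASKS`, `matches` and `relations` are hypothetical, and the sketch assumes the 9-character matrix string has already been computed by a geometry library.

```python
# Illustrative sketch only; names are hypothetical, not taken from HOBBIT.
# A DE-9IM matrix is written as 9 characters, each one of 'F', '0', '1', '2'
# (empty intersection, or the dimension of the intersection). Cell order:
# II, IB, IE, BI, BB, BE, EI, EB, EE (I = interior, B = boundary, E = exterior).

# Standard DE-9IM masks for a few dimension-independent relations.
RELATION_MASKS = {
    "equals": ["T*F**FFF*"],
    "disjoint": ["FF*FF****"],
    "touches": ["FT*******", "F**T*****", "F***T****"],
    "within": ["T*F**F***"],
    "contains": ["T*****FF*"],
}


def matches(matrix: str, mask: str) -> bool:
    """Return True if a DE-9IM matrix string satisfies a mask.

    In a mask, '*' accepts anything, 'T' accepts any non-empty
    intersection ('0', '1' or '2'), and any other character must
    match the matrix cell exactly.
    """
    if len(matrix) != 9 or len(mask) != 9:
        raise ValueError("DE-9IM matrices and masks have exactly 9 cells")
    for cell, want in zip(matrix, mask):
        if want == "*":
            continue
        if want == "T":
            if cell not in "012":
                return False
        elif cell != want:
            return False
    return True


def relations(matrix: str) -> list[str]:
    """List (alphabetically) the named relations that the matrix satisfies."""
    return sorted(
        name
        for name, masks in RELATION_MASKS.items()
        if any(matches(matrix, mask) for mask in masks)
    )


if __name__ == "__main__":
    # Two polygons sharing only part of an edge.
    print(relations("FF2F11212"))
    # A polygon lying strictly inside another polygon.
    print(relations("2FF1FF212"))
```

A system evaluated on such relations has to derive the matrix from the geometries themselves, which is the expensive step; the mask check above only covers the final classification.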