Benchmark IV – Visualisation & Services

HOBBIT provides benchmarks for the Visualisation & Services part of the Big Data Value Chain. The focus is on benchmarks with well-defined metrics that do not involve users. More specifically, we are working on benchmarks for a) question answering and b) faceted browsing. The project does not benchmark user interfaces themselves but focuses on providing performance and accuracy measurements for the approaches used in interfaces.

  • The question answering benchmark focuses on some of the challenges faced by systems designed to provide correct answers to natural language queries on the basis of one or more structured datasets (knowledge bases). Here we have concentrated on three challenges (choke points): multilinguality (answering questions formulated in multiple languages), source heterogeneity (requiring the combination of structured datasets and free text as knowledge sources) and scalability (the performance of a system in answering questions at an increasing pace). Three novel datasets are used as gold-standard sources of question/answer pairs. Systems are scored on the answers they compute for these sets of natural language questions, in terms of precision, recall and F-measure.
  • The benchmark on Faceted Browsing follows a transition-centred approach. We consider different choke points for a system underlying a solution for faceted browsing. Each choke point is given by a certain type of transition from one state to another, where a state is defined by its active facet selections and the corresponding solution space. A list of browsing scenarios checks the performance of participating systems on the different types of transitions. Performance is measured as a queries-per-second score, as well as by computing accuracy, recall and F1-score of the retrieved answers. In addition to the performance on these instance retrievals, we also check performance on the computation of so-called facet counts, which estimate the number of remaining instances for future facet selection.
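
The metrics named in both benchmarks can be illustrated with a minimal Python sketch. This is not the HOBBIT implementation: the function names, the set-based comparison of answers, the convention of scoring an empty gold answer matched by an empty system answer as 1.0, and the dictionary representation of instances are all assumptions made for illustration.

```python
from collections import Counter


def precision_recall_f1(gold, returned):
    """Compare a system's answer set against the gold-standard answer set."""
    gold, returned = set(gold), set(returned)
    # Assumed convention: an empty answer to a question with no gold answers is fully correct.
    if not gold and not returned:
        return 1.0, 1.0, 1.0
    true_positives = len(gold & returned)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def queries_per_second(num_queries, elapsed_seconds):
    """Throughput score: answered queries divided by wall-clock time."""
    return num_queries / elapsed_seconds


def facet_counts(instances, facet):
    """Count the instances of the current solution space per value of one facet.

    Each instance is assumed to be a dict mapping facet names to values.
    """
    return Counter(inst[facet] for inst in instances if facet in inst)
```

For example, with gold answers {a, b, c} and returned answers {a, b, d, e}, two of the four returned answers are correct and two of the three gold answers are found, giving a precision of 1/2, a recall of 2/3 and an F1-score of 4/7.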