Architecture of the HOBBIT Platform

Over the course of the HOBBIT project, we have collected several requirements from the community that a Big Linked Data benchmarking platform should fulfil. All of these requirements influenced the final architecture of our platform. However, this blog post focuses on its central functionality: we want to offer a platform that enables any developer to upload and benchmark their distributed Big Linked Data systems in a fair and comparable way. With this main goal in mind, we developed the architecture of the platform as shown in the figure below.

Overview of the platform components.

Platform components are marked blue, the components of an individual benchmark are shown as orange boxes, and the system being benchmarked is marked green. The platform components are:

  • The platform controller is the central component of the HOBBIT platform, coordinating the interaction of the other components where needed. This mainly includes handling requests that come from the front end component, starting and stopping benchmarks, triggering the analysis component and monitoring the health of the underlying server cluster.
  • The storage component contains the experiment results. It comprises two Docker containers: a triple store that uses the HOBBIT ontology to describe the results, and a Java program that handles the communication between the message bus and the triple store.
  • The front end component handles the interaction with the user. It offers different functionalities to different user groups. To do so, it interacts with the user management component, which supports different roles for authenticated users as well as a guest role for unauthenticated users.
  • The analysis component is triggered after an experiment has been carried out successfully. Its task is to enrich the benchmark results by combining them with the features of the benchmarked system and the data or task generators. These combinations can lead to additional insights, e.g., on the strengths and weaknesses of a certain system.
  • The platform's repository contains Docker images and metadata of the uploaded benchmarks and systems. If a user wishes to benchmark her system, the images of the benchmark and the system are retrieved from the repository and instantiated. The platform controller is the only component that has direct access to the Docker daemon. If another component, e.g., a benchmark controller, needs to start a Docker container, it has to send a request containing the image name and parameters to the platform controller. Thus, the platform controller offers central control of the commands that are sent to the Docker daemon, which increases the security of the platform while still enabling the benchmark and the benchmarked system to spawn additional containers. This is an important feature for supporting the benchmarking of distributed systems.
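
The container-start flow described in the last item can be illustrated with a minimal sketch. It is not the HOBBIT implementation: all class and field names are hypothetical, the Docker daemon is replaced by an in-memory stand-in that starts nothing, and the rule that only images known to the repository may be started is an assumption made for illustration. The sketch only shows the idea that a single component holds the daemon handle and every other component has to go through it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ContainerRequest:
    """Hypothetical request message: image name plus parameters."""
    requester: str
    image: str
    parameters: Dict[str, str] = field(default_factory=dict)


class FakeDockerDaemon:
    """In-memory stand-in for the Docker daemon; it starts nothing real."""

    def __init__(self) -> None:
        self.running: List[str] = []

    def run(self, image: str, parameters: Dict[str, str]) -> str:
        # Derive a fake container id from the image name and a counter.
        container_id = f"{image}-{len(self.running)}"
        self.running.append(container_id)
        return container_id


class PlatformController:
    """Sole holder of the daemon handle; every start request passes through it."""

    def __init__(self, daemon: FakeDockerDaemon, known_images: Set[str]) -> None:
        self._daemon = daemon
        self._known_images = set(known_images)
        self.log: List[Tuple[str, str]] = []

    def handle(self, request: ContainerRequest) -> str:
        # Assumed policy: only images present in the repository may be started.
        if request.image not in self._known_images:
            raise PermissionError(f"unknown image: {request.image}")
        container_id = self._daemon.run(request.image, request.parameters)
        # Record who asked for which container (central point of control).
        self.log.append((request.requester, container_id))
        return container_id
```

A benchmark controller would then call something like `controller.handle(ContainerRequest("benchmark-controller", "example/data-generator", {"SEED": "42"}))` instead of talking to the daemon itself. In the real platform this request travels as a message rather than as a direct method call.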

The communication between the platform components is mainly carried out using RabbitMQ. Two components have to communicate with the platform controller via RabbitMQ message queues: the benchmark controller (the Docker container of the benchmark that is created by the platform and might spawn additional containers) and the system adapter (the Docker container that spawns the system and implements the API of the benchmark). Apart from that, the benchmark and the benchmarked system can use other communication technologies as well. Developers who want to use Apache Storm or Apache Kafka for benchmarking a system or inside their system are free to do so. For example, the BigDataEurope project packaged Apache Hadoop and Apache Spark clusters into Docker images, which enables the benchmarking of solutions based on YARN MapReduce or Apache Spark jobs on the HOBBIT platform.
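
The split between the mandatory control channel and the freely chosen data channel could be sketched as follows. This is an illustration under stated assumptions, not the HOBBIT API: the `Transport` interface, the `InMemoryQueue` class, the message fields and the `ready` event are all hypothetical, and Python's standard `queue` module stands in for a real message broker such as RabbitMQ.

```python
import json
import queue
from typing import Protocol


class Transport(Protocol):
    """Hypothetical minimal interface that any messaging technology could implement."""

    def send(self, payload: str) -> None: ...

    def receive(self) -> str: ...


class InMemoryQueue:
    """Single-process stand-in for a message queue; not a real broker client."""

    def __init__(self) -> None:
        self._messages: "queue.Queue[str]" = queue.Queue()

    def send(self, payload: str) -> None:
        self._messages.put(payload)

    def receive(self) -> str:
        # Raises queue.Empty if no message is waiting.
        return self._messages.get_nowait()


class BenchmarkController:
    """Uses one fixed control channel and one freely chosen data channel."""

    def __init__(self, control: Transport, data: Transport) -> None:
        self.control = control  # channel to the platform controller
        self.data = data        # channel to the benchmarked system

    def announce_ready(self) -> None:
        # Hypothetical status message sent to the platform controller.
        message = {"sender": "benchmark-controller", "event": "ready"}
        self.control.send(json.dumps(message))

    def send_task(self, task_id: int, query: str) -> None:
        # Task data goes to the system over whatever transport was plugged in.
        self.data.send(json.dumps({"task": task_id, "query": query}))
```

Because `BenchmarkController` only depends on the small `Transport` interface, the data channel could be backed by a different technology without touching the control channel, which mirrors the freedom described above.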

A more detailed description of the architecture can be found in deliverable D2.1 of the HOBBIT project. A description of the first version, released in February 2017, can be found in D2.2.1, and the first update and maintenance report is available as D2.3.1. We have already received substantial feedback on our platform and are currently developing its second version, focussing on improving the user experience.
