QALD-9 Challenge – ISWC 2018

The 4th International Workshop on Natural Language Interfaces for Web of Data (NLIWoD) & 9th Question Answering over Linked Data challenge

Workshop Overview

This workshop is a joint event of two active communities in the area of interaction paradigms to Linked Data: NLIWOD4 and QALD-9. NLIWOD, a workshop for discussions on the advancement of natural language interfaces to the Web of Data, has been organized three times within ISWC, with a focus on soliciting discussions on the development of question answering systems. QALD is a benchmarking campaign for question answering over (Big) linked data, powered by the H2020 project HOBBIT (project-hobbit.eu), and has been organized as a challenge within CLEF, ESWC and ISWC. This joint workshop hopes to attract people from the two communities in order to promote active collaboration, to extend the scope of currently addressed topics, and to foster the reuse of resources developed so far. Furthermore, we offer a challenge, QALD-9, where users are free to demonstrate the capabilities of their systems using the provided online benchmark platform. Finally, we want to broaden the scope of this workshop series to dialogue systems and chatbots as increasingly important business intelligence factors.

Challenge Overview

In addition to contributing innovative research, participants are cordially invited to take part with Question Answering systems. NLIWoD incorporates the 9th Question Answering over Linked Data (QALD) challenge, where users can display the capabilities of their systems using the provided online benchmarking platform GERBIL QA, supported by the H2020 project HOBBIT.

Q&A

1. Do I need to submit a paper?
=> Challenge participants do not have to submit a paper but are free to submit a workshop paper to NLIWoD to present their QALD Challenge entry. Such a paper must be submitted within the workshop deadlines.
2. Do I need to be registered for ISWC?
=> Yes.
3. Where can I find more information about the last series?
=> Information about the last QALD challenge can be found at the HOBBIT website.

A more analytic description of the NLIWoD workshop and the QALD-9 challenge can be found here.

Contact Information

In case of questions, please contact NLIWoD4@easychair.org

Important Dates

Workshop related

Workshop papers due: June 15, 2018 (extended)
Notification of accepted workshop papers: June 27, 2018
Materials for the USB sticks: July 31, 2018
Publication of workshop proceedings: August 15, 2018
Workshops held: October 8-9, 2018

Challenge related

Training data available: April 31st, 2018
Integration of training data to the GERBIL QA platform by: May 1st, 2018
System submission due: September 1st, 2018
System integration with GERBIL QA done: September 15th, 2018
Publication of test data: October 8th, 2018
Results on test data: October 9th, 2018

Tasks & Training Data

System integration into the platform

To integrate your system with our benchmarking platform, please follow our GERBIL QA instructions. We also provide you with a Java-based template for the web service integration with GERBIL QA.
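As a rough orientation, the sketch below shows what such a web service might look like using only the Python standard library. It assumes that the benchmark sends an HTTP POST with the form parameters `query` (the question text) and `lang`, and expects a QALD-style JSON document in return. These parameter names and the response layout are assumptions and should be checked against the GERBIL QA instructions. `answer_question` is a hypothetical placeholder for your own system, and the port is an arbitrary choice.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs


def answer_question(question, lang):
    """Hypothetical placeholder: plug in your own QA system here."""
    return []  # list of answer URIs; empty means "no answer found"


def build_response(question, lang, uris):
    """Wrap answers in a QALD-style JSON structure (layout is an assumption)."""
    bindings = [{"uri": {"type": "uri", "value": u}} for u in uris]
    return {
        "questions": [{
            "id": "1",
            "question": [{"language": lang, "string": question}],
            "query": {"sparql": ""},
            "answers": [{
                "head": {"vars": ["uri"]},
                "results": {"bindings": bindings},
            }],
        }]
    }


class QAHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the form-encoded body; 'query' and 'lang' are assumed names.
        length = int(self.headers.get("Content-Length", 0))
        params = parse_qs(self.rfile.read(length).decode("utf-8"))
        question = params.get("query", [""])[0]
        lang = params.get("lang", ["en"])[0]
        uris = answer_question(question, lang)
        payload = json.dumps(build_response(question, lang, uris)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the service; the port is an arbitrary choice for local testing."""
    HTTPServer(("", port), QAHandler).serve_forever()
```

Calling `serve()` starts the service locally. The Java-based template remains the reference starting point; this sketch only illustrates the general request and response flow.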

In case of questions, please contact Ricardo Usbeck <ricardo.usbeck AT uni-paderborn.de>.

We will use the QALD Macro F-measure as the comparison criterion on the test dataset. Note that we will test your system in English plus other available languages.
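To give an idea of how a macro-averaged F-measure over answer sets can be computed, here is a minimal Python sketch. The handling of empty answer sets and the choice to compute the final score from the averaged precision and recall are assumptions made for this illustration; the authoritative definition is the one implemented in GERBIL QA.

```python
def precision_recall(gold, system):
    """Per-question precision and recall over answer sets.

    The conventions for empty sets are assumptions of this sketch:
    - both empty: counted as fully correct (1, 1);
    - gold empty, system non-empty: (0, 0);
    - gold non-empty, system empty: the system abstains, (1, 0).
    """
    gold, system = set(gold), set(system)
    if not gold and not system:
        return 1.0, 1.0
    if not gold:
        return 0.0, 0.0
    if not system:
        return 1.0, 0.0
    hits = len(gold & system)
    return hits / len(system), hits / len(gold)


def macro_f_measure(pairs):
    """Average precision and recall over all questions, then combine them."""
    scores = [precision_recall(gold, system) for gold, system in pairs]
    if not scores:
        return 0.0
    macro_p = sum(p for p, _ in scores) / len(scores)
    macro_r = sum(r for _, r in scores) / len(scores)
    if macro_p + macro_r == 0:
        return 0.0
    return 2 * macro_p * macro_r / (macro_p + macro_r)
```

For example, `macro_f_measure([({"a", "b"}, {"a"}), ({"c"}, set())])` averages a precision of 1.0 and a recall of 0.25 over the two questions.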

Datasets

Task 1 – Multilingual QA over DBpedia

Train dataset: TBA
Test dataset: TBA

Given the diversity of languages used on the web, there is a pressing need to facilitate multilingual access to semantic data. The core task of QALD is thus to retrieve answers from an RDF data repository given an information need expressed in a variety of natural languages.

The underlying RDF dataset will be DBpedia 2016-10. The training data will consist of more than 250 questions compiled and curated from previous challenges. The questions will be available in 3 to 8 different languages (English, Spanish, German, Italian, French, Dutch, Romanian, Hindi and Farsi), possibly with the addition of further languages (Korean and Brazilian Portuguese). Those questions are general, open-domain factual questions, for example:

(en) Which book has the most pages?
(de) Welches Buch hat die meisten Seiten?
(es) Que libro tiene el mayor numero de paginas?
(it) Quale libro ha il maggior numero di pagine?
(fr) Quel livre a le plus de pages?
(nl) Welk boek heeft de meeste pagina’s?
(ro) Ce carte are cele mai multe pagini?

The questions vary with respect to their complexity, including questions with counts (e.g., How many children does Eddie Murphy have?), superlatives (e.g., Which museum in New York has the most visitors?), comparatives (e.g., Is Lake Baikal bigger than the Great Bear Lake?), and temporal aggregators (e.g., How many companies were founded in the same year as Google?). Each question is annotated with a manually specified SPARQL query and answers.
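To make the annotation concrete, the sketch below shows how one such annotated question could be represented and read in Python. The JSON layout (field names such as `question`, `query` and `answers`) is an assumption modelled on earlier QALD releases, and the SPARQL query and answer URI are illustrative placeholders rather than entries from the actual dataset.

```python
import json

# One annotated question in a QALD-style layout (assumed, not authoritative).
RECORD = json.loads("""
{
  "id": "1",
  "question": [
    {"language": "en", "string": "Which book has the most pages?"},
    {"language": "de", "string": "Welches Buch hat die meisten Seiten?"}
  ],
  "query": {
    "sparql": "PREFIX dbo: <http://dbpedia.org/ontology/> SELECT ?b WHERE { ?b a dbo:Book ; dbo:numberOfPages ?n } ORDER BY DESC(?n) LIMIT 1"
  },
  "answers": [
    {
      "head": {"vars": ["b"]},
      "results": {"bindings": [
        {"b": {"type": "uri", "value": "http://example.org/some-book"}}
      ]}
    }
  ]
}
""")


def question_text(record, lang):
    """Return the question string for a language code, or None if missing."""
    for entry in record["question"]:
        if entry["language"] == lang:
            return entry["string"]
    return None


def answer_values(record):
    """Collect all answer values of a record into a set."""
    values = set()
    for answer in record["answers"]:
        for binding in answer.get("results", {}).get("bindings", []):
            for cell in binding.values():
                values.add(cell["value"])
    return values
```

A system under evaluation would read the question text for the requested language and return its own answer set, which is then compared with the gold answers extracted here.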

Data creation: The test dataset will consist of 50 to 100 similar, manually compiled questions. We plan to compile those from existing, real-world question and query logs in order to provide unbiased questions expressing real-world information needs; these will then be manually curated to ensure a high quality standard. Existing methodology for selecting queries from query logs has been shown to retrieve prototypical queries. Over the course of the previous QALD challenges, more than 30 systems were submitted, covering most of the languages.

Task 2 – English question answering over Wikidata (will take place only if enough participants register)

Train dataset: TBA
Test dataset: TBA

A further task introduced this year uses the public data source Wikidata (https://www.wikidata.org/) as its target repository. The training data will include 100 open-domain factual questions compiled from the previous iteration of Task 1. In this task, the questions originally formulated for DBpedia should be answered using Wikidata. Thus, your systems will have to deal with a different data representation structure. This task will help to evaluate how generic your approach is and how easy it is to adapt to a new data source. Note that the results obtained from Wikidata might differ from the answers to the same queries found in DBpedia.
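To illustrate what a different data representation structure means in practice, the sketch below pairs one of the example questions with a possible query for each knowledge base. Both queries are illustrative assumptions rather than gold-standard queries from the dataset, and the identifiers used (`dbo:numberOfPages`, `wd:Q571` for "book", `wdt:P1104` for "number of pages") should be verified against the respective knowledge base before use.

```python
# The same information need, phrased for two knowledge bases.
# Both queries are illustrative assumptions, not gold-standard queries.
QUESTION = "Which book has the most pages?"

QUERIES = {
    # DBpedia uses human-readable ontology terms.
    "dbpedia": (
        "PREFIX dbo: <http://dbpedia.org/ontology/> "
        "SELECT ?book WHERE { ?book a dbo:Book ; dbo:numberOfPages ?pages } "
        "ORDER BY DESC(?pages) LIMIT 1"
    ),
    # Wikidata uses opaque item (Q...) and property (P...) identifiers.
    "wikidata": (
        "PREFIX wd: <http://www.wikidata.org/entity/> "
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/> "
        "SELECT ?book WHERE { ?book wdt:P31 wd:Q571 ; wdt:P1104 ?pages } "
        "ORDER BY DESC(?pages) LIMIT 1"
    ),
}


def query_for(knowledge_base):
    """Return the SPARQL query string for the given target knowledge base."""
    try:
        return QUERIES[knowledge_base]
    except KeyError:
        raise ValueError(f"unknown knowledge base: {knowledge_base}")
```

The query shape stays the same, but a system has to map "book" and "pages" to entirely different vocabulary, which is the adaptation effort this task is meant to measure.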

Data creation: This task was designed in the context of the DIESEL project (https://diesel-project.eu/). The training set contains 100 questions taken from Task 1 of the QALD-6 challenge. We formulated the queries to answer these questions from Wikidata and generated the gold standard answers using them. For this task, we use the Wikidata dump from January 9, 2017 (https://dumps.wikimedia.org/wikidatawiki/entities/20170109/). The Wikidata dataset used to create this benchmark can be found on HOBBIT’s FTP server, and the Docker image for running this data with Blazegraph can be found in metaphacts’ Docker Hub.

Registration

To participate in QALD-9, you need to fill out the registration form, found at http://2018.nliwod.org/challenge.

Program & Accepted Papers

Challenge Session
Tuesday morning, October 9th, 2018
9:00 – 12:30
9:00 – 10:30 Introduction: Key-Sun Choi (5 min)

Keynote: Peter F. Patel-Schneider (30 min + 10 min Q&A)

Richard Frost and Shane Peelar: An Extensible Natural-Language Query Interface to an Event-Based Semantic Web (30 min + 10 min)

Younggyun Hahm, Jiho Kim, Sangmin An, Minho Lee and Key-Sun Choi: Chatbot Who Wants to Learn the Knowledge: KB-Agent (15 min + 10 min)

10:30 – 11:00 Coffee Break
11:00 – 12:30 QALD 9 Challenge Overview and Evaluation (20 min), presented by Muhammad Saleem

Jiho Kim, Sangha Nam and Key-Sun Choi: Open Relation Extraction by Matrix Factorization and Universal Schemas (20 min + 10 min)

Best Paper: Kyriaki Zafeiroudi, Leah Eckman and Rebecca Passonneau: Testing a Knowledge Inquiry System on Question Answering (30 min + 10 min)

Organization

Organizers

Program Committee

  • Kuldeep Singh,  Fraunhofer, Germany
  • Ria Hari Gusmita, Syarif Hidayatullah State Islamic University Jakarta, Indonesia
  • Vanessa Lopez, IBM, Ireland
  • Grigorios Tzortzis, NCSR Demokritos, Greece
  • Dennis Diefenbach, Jean Monnet University, France
  • Edgard Marx, Leipzig University of Applied Sciences (HTWK), Germany
  • Roberto Garcia, Universitat de Lleida, Spain
  • Amrapali Zaveri, Maastricht University, Netherlands
  • Giorgos Giannopoulos, Imis Institute, “Athena” R.C., Greece
  • Raghava Mutharaju, GE Global Research, USA
  • Varish Mulwad, GE Global Research, USA
  • Subhabrata Mukherjee, Max Planck Institute for Informatics, Germany

——————————–
CALL FOR PAPERS
——————————–

The 4th International Workshop on Natural Language Interfaces for Web of Data (NLIWoD) & 9th Question Answering over Linked Data challenge

Website: http://2018.NLIWoD.org/
Date: October 9th, 2018, Monterey, California, USA

The primary goal of this workshop is to bring together experts on the use of natural-language interfaces (NLI) for answering questions, especially over the Web of Data. In addition, NLIWoD provides a venue for the popular Question Answering over Linked Data challenge, which is now in its 9th instalment (QALD-9).

——————————–
# Workshop
——————————–
Topics of interest include but are not limited to:
* NEW focus! Dialogue systems
* NEW focus! Personal assistants
* NEW focus! Chatbots (over Linked Data)
* Question answering over Linked Data
* Deep Learning for Natural Language Interaction
* Self-wiring natural language systems
* Benchmarking natural language interfaces
* Term matching and entity disambiguation
* Browsing Linked Data
* Indexing for question answering
* SPARQL query pattern generation
* Schema-agnostic query generation
* Discovery of Linked Data sources
* Endpoint profiling
* Dealing with data and schema heterogeneity
* Providing justifications of answers and conveying trust
* Knowledge base design for question answering
* Language resources and NLP software for question answering
* Reasoning for question answering
* Natural language querying of RDF exposed as Linked Data
* Natural language querying of Web services
* User feedback and interaction

——————————–
## Important Dates
——————————–
* Workshop papers due: June 15, 2018 (extended)
* Notification of accepted workshop papers: June 27, 2018
* Materials for the USB sticks: July 31, 2018
* Publication of workshop proceedings: August 15, 2018
* Workshops held: October 8-9, 2018

——————————–
## Submission Instructions
——————————–
NLIWoD solicits the submission of original research articles of the following types:
* Full article submissions (up to 16 pages) must describe substantial and original work.
* Short article submissions (up to 8 pages) must describe original work, which may present
+ A small, focused contribution,
+ A work in progress, or
+ An interesting application case.
* Demo and poster presentations (up to 4 pages)
* All submissions must be in English and formatted in the style of the Springer Publications format for Lecture Notes in Computer Science (LNCS). For details on the LNCS style, see Springer’s Author Instructions.
* All articles are to be submitted via EasyChair: https://easychair.org/conferences/?conf=nliwod4
* All submissions will be peer-reviewed by the Program Committee of the workshop.
* Submissions do not need to be anonymous.

We encourage HTML submissions. However, we are bound to the ISWC proceedings decision, which is still open; until it is made, a PDF version of your HTML page, formatted according to LNCS, is required as well. For useful information about HTML submissions, please see https://2018.eswc-conferences.org/call-for-papers/

——————————–
# Challenge
——————————–
In addition to contributing innovative research, participants are cordially invited to take part with Question Answering systems. NLIWoD incorporates the 9th Question Answering over Linked Data (QALD) challenge, where users can display the capabilities of their systems using the provided online benchmarking platform GERBIL QA, supported by the H2020 project HOBBIT.

——————————–
## Important Dates
——————————–
– Training data available: April 31st, 2018
– Integration of training data to the GERBIL QA platform by: May 1st, 2018
– System submission due: September 1st, 2018
– System integration with GERBIL QA done: September 15th, 2018
– Publication of test data: October 8th, 2018
– Results on test data: October 9th, 2018

——————————–
## Registration
——————————–
Please follow this link: https://goo.gl/forms/kGc7ix2VMZkEBzYh2

——————————–
## System integration into the platform
——————————–
To integrate your system with our benchmarking platform please follow our GERBIL QA instructions: https://github.com/dice-group/gerbil/wiki/Question-Answering#web-service-interface

We also provide you with a Java-based template for the web service integration with GERBIL QA:
https://github.com/dice-group/GerbilQA-Benchmarking-Template

We will use the QALD Macro F-measure as the comparison criterion on the test dataset.
Note that we will test your system in English plus other available languages.

——————————–
## FAQ
——————————–
1. Do I need to submit a paper?
=> Challenge participants do not have to submit a paper but are free to submit a workshop paper to NLIWoD to present their QALD Challenge entry. Such a paper must be submitted within the workshop deadlines.
2. Do I need to be registered for ISWC?
=> Yes.
3. Where can I find more information about the last series?
=> Information about the last QALD challenge can be found at the HOBBIT website.

——————————–
# Organizers
——————————–
– Key-Sun Choi (KAIST, KR)
– Jin-Dong Kim (Database Center for Life Science, ROIS, JP)
– Axel-Cyrille Ngonga Ngomo (Paderborn University, DE)
– Ricardo Usbeck (Paderborn University, DE)

——————————–
# Contact Information
——————————–
In case of questions, please contact NLIWoD4@easychair.org
