Link Discovery Task at the OAEI OM 2018 Workshop
The number of datasets published in the Web of Data as part of the Linked Data Cloud is constantly increasing. The Linked Data paradigm is based on the unconstrained publication of information by different publishers, and the interlinking of Web resources across knowledge bases. In most cases, the cross-dataset links are not explicit in the dataset and must be automatically determined using Instance Matching (IM) and Link Discovery tools amongst others. The large variety of techniques requires their comparative evaluation to determine which one is best suited for a given context. Performing such an assessment generally requires well-defined and widely accepted benchmarks to determine the weak and strong points of the proposed techniques and/or tools.
A number of real and synthetic benchmarks that address different data linking challenges have been proposed for evaluating the performance of such systems. So far, only a limited number of link discovery benchmarks target the problem of linking geo-spatial entities.
However, some of the largest knowledge bases on the Linked Open Data Web are geospatial knowledge bases (e.g., LinkedGeoData with more than 30 billion triples). Linking spatial resources requires techniques that differ from the classical, mostly string-based approaches. In particular, considering the topology of the spatial resources and the topological relations between them is of central importance to systems driven by spatial data.
Given the large number of geospatial datasets used in Linked Data and across several domains, we believe it is critical to develop benchmarks for geospatial link discovery.
The proposed task, entitled “Link Discovery Task”, has been accepted at the OAEI OM 2018 Workshop at ISWC 2018. The OM workshop conducts an extensive and rigorous evaluation of ontology matching and instance matching (link discovery) approaches through the OAEI (Ontology Alignment Evaluation Initiative) 2018 campaign.
The aim of the Task is to test the performance of Link Discovery tools that implement string-based as well as topological approaches for identifying matching spatial entities. The different frameworks will be evaluated for both accuracy (precision, recall and f-measure) and time performance.
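The accuracy measures named above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a link is a (source, target) pair of identifiers and that a gold standard set of correct links is available; the function name `evaluate_links` and the sample identifiers are hypothetical and are not part of the OAEI or HOBBIT tooling.

```python
from typing import Set, Tuple

# Assumption: a link is a (source identifier, target identifier) pair.
Link = Tuple[str, str]


def evaluate_links(predicted: Set[Link], gold: Set[Link]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) of predicted links against a gold standard."""
    true_positives = len(predicted & gold)
    # Precision: fraction of predicted links that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of gold standard links that were found.
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure: harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical data: 4 correct links exist, the tool returns 3, of which 2 are right.
gold = {("s1", "t1"), ("s2", "t2"), ("s3", "t3"), ("s4", "t4")}
predicted = {("s1", "t1"), ("s2", "t2"), ("s3", "t9")}
precision, recall, f_measure = evaluate_links(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f-measure={f_measure:.3f}")
```

Treating links as sets means that duplicate links in a tool's output do not affect the scores in this sketch; an actual evaluation platform may handle duplicates differently.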
Q & A
|Deadline for the submission of papers||June 4th|
|Deadline for the notification of acceptance/rejection||June 27th|
|Early registration deadline||June 29th|
|Workshop camera ready copy submission||June 31st|
|(Preliminary) datasets available||June 15th|
|Preparation phase ends and final datasets||July 15th|
|Participants register their tool (mandatory)||July 31st|
|Execution phase ends and participants submit final versions of their tools. SEALS tracks (zip file, e.g., LogMap.zip). HOBBIT tracks (via platform).||August 31st|
|Evaluation phase ends and results are available. SEALS and HOBBIT tracks.||September 30th|
|Preliminary version of system papers due. Submit a PDF (e.g., LogMap_prelim.pdf).||October 3rd|
|Ontology matching workshop.||October 8th|
|Final version of system papers due. Submit a PDF (e.g., LogMap_final.pdf).||November 1st|
Tasks and Training Data
We will use TomTom datasets to create the benchmarks. TomTom datasets contain representations of traces (GPS fixes). Each trace consists of a number of points. Each point has a timestamp, longitude, latitude and speed (value and metric). The points are sorted in ascending order by the timestamp of the corresponding GPS fix.
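The trace structure described above might be modelled as in the following sketch. The class and field names are hypothetical and do not reflect the actual TomTom schema; the serialisation to a WKT `LINESTRING` with longitude before latitude is likewise an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class TracePoint:
    """One GPS fix of a trace (field names are illustrative assumptions)."""
    timestamp: datetime
    longitude: float
    latitude: float
    speed_value: float
    speed_metric: str


def sort_trace(points: List[TracePoint]) -> List[TracePoint]:
    """Return the points in ascending order of their GPS fix timestamp."""
    return sorted(points, key=lambda p: p.timestamp)


def trace_to_wkt(points: List[TracePoint]) -> str:
    """Serialise a trace as a WKT LINESTRING, assuming x = longitude, y = latitude."""
    if len(points) < 2:
        raise ValueError("a LINESTRING needs at least two points")
    coords = ", ".join(f"{p.longitude} {p.latitude}" for p in sort_trace(points))
    return f"LINESTRING ({coords})"


# Hypothetical trace with two fixes, deliberately given out of order.
trace = [
    TracePoint(datetime(2018, 1, 1, 12, 0, 5), 4.36, 50.86, 40.0, "km/h"),
    TracePoint(datetime(2018, 1, 1, 12, 0, 0), 4.35, 50.85, 38.0, "km/h"),
]
print(trace_to_wkt(trace))
```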
This version of the challenge will comprise the following tasks:
- Task 1 (Linking) will measure how well the systems can match traces that have been altered using string-based approaches along with addition and deletion of intermediate points. Because the TomTom dataset contains only coordinates, we have replaced a number of them with labels retrieved from the Google Maps API, the Foursquare API and the Nominatim OpenStreetMap API so that string-based modifications based on LANCE can be applied. This task also includes changes to date formats and coordinate formats.
- Task 2 (Spatial) measures how well the systems can identify DE-9IM (Dimensionally Extended nine-Intersection Model) topological relations. The supported spatial relations are the following: Equals, Disjoint, Touches, Contains/Within, Covers/CoveredBy, Intersects, Crosses and Overlaps. The traces are represented in Well-known text (WKT) format. For each relation, a different pair of source and target datasets will be given to the participants.
 Saveta, E. Daskalaki, G. Flouris, I. Fundulaki, and A. Ngonga-Ngomo. LANCE: Piercing to the Heart of Instance Matching Tools. In ISWC, 2015.
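As a rough illustration of how the Task 2 relations relate to DE-9IM, the sketch below checks a nine-character intersection matrix against the commonly used predicate patterns. It assumes the matrix has already been computed by a geometry library, and it assumes both geometries are lines when choosing the Crosses and Overlaps patterns, since those two depend on geometry dimension. The function names are hypothetical and this is not the benchmark's own implementation.

```python
from typing import Dict, List

# Standard DE-9IM patterns: 'T' = non-empty intersection (0, 1 or 2),
# 'F' = empty intersection, '*' = any value, digits = exact dimension.
# Assumption: Crosses and Overlaps use the line/line patterns only.
RELATION_PATTERNS: Dict[str, List[str]] = {
    "Equals": ["T*F**FFF*"],
    "Disjoint": ["FF*FF****"],
    "Touches": ["FT*******", "F**T*****", "F***T****"],
    "Contains": ["T*****FF*"],
    "Within": ["T*F**F***"],
    "Covers": ["T*****FF*", "*T****FF*", "***T**FF*", "****T*FF*"],
    "CoveredBy": ["T*F**F***", "*TF**F***", "**FT*F***", "**F*TF***"],
    "Intersects": ["T********", "*T*******", "***T*****", "****T****"],
    "Crosses": ["0********"],
    "Overlaps": ["1*T***T**"],
}


def matches(matrix: str, pattern: str) -> bool:
    """Check a DE-9IM matrix such as '0F1FF0102' against one pattern."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM matrices and patterns have exactly 9 characters")
    for cell, wanted in zip(matrix, pattern):
        if wanted == "*":
            continue
        if wanted == "T":
            if cell not in "012":
                return False
        elif cell != wanted:
            return False
    return True


def relations_for(matrix: str) -> List[str]:
    """Return every relation whose patterns accept the given matrix."""
    return [
        name
        for name, patterns in RELATION_PATTERNS.items()
        if any(matches(matrix, p) for p in patterns)
    ]


# Two lines crossing at a single interior point have the matrix '0F1FF0102'.
print(relations_for("0F1FF0102"))
```

A single pair of geometries can satisfy several relations at once (for example, equal lines are also within and covered by each other), which is one reason a separate source and target pair per relation is convenient.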
Registration & Submission
Participants must register their tool using this form (requires a Google account and a valid email).
The preliminary version of system papers must be submitted as a PDF (e.g., LogMap_prelim.pdf). Please use this form (requires a Google account and a valid email).
The final version of system papers must be submitted as a PDF (e.g., LogMap_final.pdf). Please use this form (requires a Google account and a valid email).
The following systems have been tested
|Tool||Institution||Country||Contact person(s)||Task to participate|
|University of Lisbon / Instituto Gulbenkian de Ciência / University of Illinois at Chicago||Portugal / USA||Daniel Faria,||Linking & Spatial|
|Rapid Discovery of||AKSW||Germany||Mohamed Ahmed Sherif,|
|Silk||National and Kapodistrian University of Athens||Greece||Despina Athanasia Pantazi, Panayiotis Smeros||Spatial|
Link Discovery – Linking results
|SANDBOX (100 instances)|
|MAINBOX (5000 instances)|
Link Discovery – Spatial results
The results can be found on the HOBBIT platform: here (login as guest)
- Pavel Shvaiko (Main contact)
Informatica Trentina, Italy
E-mail: pavel [dot] shvaiko [at] infotn [dot] it
- Jérôme Euzenat
INRIA & Univ. Grenoble Alpes, France
- Ernesto Jiménez-Ruiz
University of Oslo, Norway
- Michelle Cheatham
Wright State University, USA
- Oktie Hassanzadeh
IBM Research, USA
- Alsayed Algergawy, Jena University, Germany
- Manuel Atencia, INRIA & Univ. Grenoble Alpes, France
- Zohra Bellahsene, LRIMM, France
- Marco Combetto, Informatica Trentina, Italy
- Valerie Cross, Miami University, USA
- Warith Eddine Djeddi, LIPAH & LABGED, Tunisia
- Jérôme David, University Grenoble Alpes & INRIA, France
- Gayo Diallo, University of Bordeaux, France
- Zlatan Dragisic, private individual, Sweden
- Alfio Ferrara, University of Milan, Italy
- Wei Hu, Nanjing University, China
- Antoine Isaac, Vrije Universiteit Amsterdam & Europeana, Netherlands
- Ryutaro Ichise, National Institute of Informatics, Japan
- Daniel Faria, Instituto Gulbenkian de Ciência, Portugal
- Marouen Kachroudi, Université de Tunis El Manar, Tunisia
- Patrick Lambrix, Linköpings Universitet, Sweden
- Vincenzo Maltese, University of Trento, Italy
- Fiona McNeill, University of Edinburgh, UK
- Christian Meilicke, University of Mannheim, Germany
- Peter Mork, MITRE, USA
- Andriy Nikolov, Open University, UK
- Axel Ngonga, University of Leipzig, Germany
- Catia Pesquita, University of Lisbon, Portugal
- Umberto Straccia, ISTI-C.N.R., Italy
- Ondrej Svab-Zamazal, Prague University of Economics, Czech Republic
- Cássia Trojahn, IRIT, France
- Ludger van Elst, DFKI, Germany