Outcomes

During the datathon, participants will:

Generate and publish their own linguistic linked data from existing data sources or tools.
Apply Linked Data principles and Semantic Web technologies (Ontologies, RDF, Linked Data) to the field of language resources.
Use the principal models for representing Linguistic Linked Data, in particular OntoLex-Lemon and Web Annotation, as well as knowledge representation vocabularies such as SKOS and OWL.
Gain experience with terminology resources developed for or used in the Linguistic Linked Open Data context, such as Lexvo, LexInfo, OLiA and GOLD.
Learn about multilingual knowledge bases and entity linking against resources from the Web of Data, e.g., DBpedia or BabelNet.
Learn about potential benefits and applications of linguistic linked data for specific use cases.

Topics

During the datathon, seminars will be organised to cover topics such as:

Ontologies and Linked Data
The Lexicon Model for Ontologies (OntoLex-Lemon)
Integrating documents, annotations and NLP tools with Linked Data and RDF using Web Annotation and NIF
Guidelines for RDF generation of Language Resources
Methodologies for Linked Data publication of Language Resources
Multilingual Word Sense Disambiguation and Entity Linking
Use and Applications of Linguistic Linked Data
Metadata and Licenses for Linguistic Linked Data

With the objective of avoiding passive learning, the program of the summer datathon will contain three types of sessions:

1. Seminars

2. Practical sessions

3. Hacking sessions

Participants will be invited to propose a mini-project related to the topic and to bring to the datathon a dataset of linguistic data produced by their organizations, so that they can work on it during the hacking sessions and transform it into linked data. Participants who cannot provide their own linguistic dataset can join another participant's mini-project or one of those proposed by the organisers. There will be an award for the best mini-project.

Participants should bring their own laptops to follow the hacking sessions. They will be provided with digital copies of all the material used during the course and will receive assistance with installing all the required software.

Organisers

Christian Chiarcos

Applied Computational Linguistics (ACoLi) Lab, Goethe Universität Frankfurt

As Professor of Computer Science at Goethe University Frankfurt, Germany, I have headed the Applied Computational Linguistics (ACoLi) lab since 2013, and the research group "Linked Open Dictionaries (LiODi)" since 2015. My research focuses on semantic technologies, including computational semantics as well as the innovative application of Linked Data formalisms to problems and resources in NLP and Digital Humanities.

For further information, please visit my website

John Philip McCrae

Insight Centre for Data Analytics, NUI Galway

I am a lecturer above-the-bar at the Insight Centre for Data Analytics at the National University of Ireland Galway. I am currently working with Paul Buitelaar in the Unit for Natural Language Processing. My main research has focused on the development of linguistic linked open data and in particular on models for the representation of lexical resources, by means of the lemon and OntoLex models.

For further information, please visit my personal website

Jorge Gracia

Aragon Institute of Engineering Research (I3A), University of Zaragoza

I am an assistant professor at the Department of Computer Science and Systems Engineering (University of Zaragoza, Spain). I carry out my research in the Distributed Information Systems group, which belongs to the Aragon Institute of Engineering Research (I3A). My main research interests are Semantic Web, Ontology Matching, Multilingual Web of Data, Query Interpretation, and Linguistic Linked Data.

For further information, please visit my personal website

Local Organisers

Local organization is handled by the members of the Research Group “Linked Open Dictionaries (LiODi)”, funded by the German Federal Ministry of Education and Research (BMBF).

Monika Rind-Pawlowski

Research Group “Linked Open Dictionaries”
Goethe Universität Frankfurt am Main, Germany

Hasmik Sargsian

Research Group “Linked Open Dictionaries”
Goethe Universität Frankfurt am Main, Germany

Jesse Wichers Schreur

Research Group “Linked Open Dictionaries”
Goethe Universität Frankfurt am Main, Germany

How to Apply

We welcome participants from anywhere in the world, whether from industry or academia. Some basic acquaintance with software development and Web technologies is recommended. Participants are expected to participate fully in the activities of the datathon until its conclusion.

Fees: All SD-LLOD activities, including lecturers, tutors, teaching materials and social activities, are sponsored by supporting research projects. Participants only cover the expenses for their stay, which are handled directly with the venue, Schloss Dagstuhl. Participants should stay for the entire duration of the event.

single room: 350 EUR total, full board
shared room:* 275 EUR total, full board
(*single room with shared restroom, upon availability)

Note that datathon participants are entitled to a reduced attendance fee for the 2nd Conference on Language, Data and Knowledge (LDK-2019), Leipzig, Germany, May 20-22, 2019. Please see the LDK registration page for details.

Registration: Registration is now open. You can apply by sending an email to datathon@linguistic-lod.org. Please provide a short description of

research interests and affiliation
proposal of a mini-project [optional, as attachment, see below]
dietary preferences [optional]
accommodation preferences [single room / shared room]

Applications are accepted until April 4th, 2019. You will be notified of the selection result on April 12th, 2019.

Participants are encouraged to contribute to the datathon with their own data, their own research and their own challenges. If you want to propose a topic for a mini-project in the datathon (e.g., a language resource to be converted into linked data, an LLOD dataset to be linked to other resources, a use case description that exploits the LLOD cloud, ...) or want to report on some recent research related to the topics of the datathon, you can write a short description of your ideas (fewer than 1000 words) and send it to the organisers via email as part of the registration process by April 4th. Selected mini-project proposals and abstracts will be highlighted and presented during the event.

Important Dates

Registration opens: January 18th, 2019
Registration closes: April 4th, 2019
Notification: April 12th, 2019
Datathon: May 12th to 17th, 2019
Payment: onsite only, upon arrival, starting May 12th, 2019, 15:00 CEST

Invited speakers

Gerard de Melo

Director of the Deep Data Lab, Department of Computer Science
Rutgers University, New Jersey

Gerard de Melo is an Assistant Professor at Rutgers University (NJ, USA), where he heads the Deep Data Lab. Over the years, he has published over 100 papers on natural language processing, AI, and Big Data analytics, with Best Paper/Demo awards at WWW 2011, CIKM 2010, ICGL 2008, and the NAACL 2015 Workshop on Vector Space Modeling. Notable research projects include Lexvo.org, FrameBase.org, the Universal WordNet, and the Etymological WordNet. Prior to joining Rutgers, he was a faculty member at Tsinghua University and a Post-Doctoral Research Scholar at ICSI/UC Berkeley. He received his doctoral degree at the Max Planck Institute for Informatics.

For further information, please consult his website

Richard Eckart de Castilho

Ubiquitous Knowledge Processing (UKP) Lab, Department of Computer Science
Technische Universität Darmstadt, Germany

Dr. Richard Eckart de Castilho is a senior researcher at the Ubiquitous Knowledge Processing Lab, TU Darmstadt. He is interested in architectures, tools and infrastructures for the automatic and interactive analysis of text data. Richard is presently building a next-generation text annotation platform as a PI in the DFG-funded project INCEpTION. He is a member of the CEDIFOR Digital Humanities Centre and of the Apache Software Foundation, maintains DKPro Core, WebAnno and Apache uimaFIT, and is involved in various other open source projects related to NLP.

For further information, please consult his website

Tutors

Sina Ahmadi
NUI Galway, Ireland

Julia Bosque-Gil
UPM Madrid, Spain

Christian Fäth
Goethe Universität Frankfurt, Germany

Max Ionov
Goethe Universität Frankfurt, Germany

Bettina Klimek
AKSW Leipzig, Germany

Alessandro Adamou
NUI Galway, Ireland

Andon Tchechmedjiev
IMT École des Mines d'Alès, France

Thierry Declerck
DFKI GmbH

Venue

The Summer Datathon on Linguistic Linked Open Data (SD-LLOD-19) will be held from May 12th to 17th, 2019 at Schloss Dagstuhl – Leibniz Center for Informatics, in Wadern, Germany. Schloss Dagstuhl is a prominent location for workshops and meetings in computer science, situated in an idyllic rural environment near the French and Luxembourg borders. It is easily reachable via the cities of Mainz, Saarbrücken or Trier, and from the airports Frankfurt am Main (FRA), Frankfurt-Hahn (HHN), Saarbrücken (SCN) or Luxembourg (LUX). For details on getting there, please see the Schloss Dagstuhl arrival information.

photo (c) L. Sieht, Wikipedia, CC-BY 3.0

Schedule

	Sun 12/5	Mon 13/5	Tue 14/5	Wed 15/5	Thu 16/5	Fri 17/5
07:30 - 08:45		breakfast	breakfast	breakfast	breakfast	breakfast
09:00 - 09:30		Welcome	Presentation of participant groups	Practical Session: Generating & Publishing Language Resources	Seminar: Metadata	Seminar: OntoLex extensions
09:30 - 10:00		Introduction: Linguistic Linked Open Data	Invited Talk: Richard Eckart de Castilho		Practical Session: Metadata	Practical Session: OntoLex extensions
10:00 - 10:30		Introduction: Linguistic Linked Open Data	Invited Talk: Richard Eckart de Castilho		Datathon	Practical Session: OntoLex extensions
10:30 - 11:00		break	break	break	break	break
11:00 - 11:30		Practical Session: Linguistic Linked Open Data	Seminar: OntoLex-lemon	Practical Session: Linking Datasets	Datathon	Feedback and Review Session
11:30 - 12:00		Practical Session: Linguistic Linked Open Data	Seminar: OntoLex-lemon	Practical Session: Linking Datasets	Datathon	Feedback and Review Session
12:00 - 12:30		lunch	lunch	lunch	lunch	lunch
12:30 - 13:00			lunch	lunch	lunch	lunch
13:00 - 13:30			daily report (tutors only)	daily report (tutors only)	daily report (tutors only)	Datathon Presentations
13:30 - 14:00		Seminar: Ontologies	Seminar: Annotations & NLP	Datathon	Datathon
14:00 - 14:30		Seminar: Ontologies	Seminar: Annotations & NLP
14:30 - 15:00		Practical Session: Ontologies	Practical Session: SPARQL & CoNLL-RDF			Invited Talk: Gerard de Melo
15:00 - 15:30	Arrival & Registration	Practical Session: Ontologies	Practical Session: SPARQL & CoNLL-RDF			Invited Talk: Gerard de Melo
15:30 - 16:00	Arrival & Registration	coffee break	coffee break	coffee break	coffee break	coffee break
16:00 - 16:30	Installfest: Technical Setup (all participants)	Participant's minute madness	Datathon	Excursion (Trier)	Datathon	Conclusion & Awards
16:30 - 17:00		Participant's minute madness				Conclusion & Awards
17:00 - 17:30		Group Formation & Project Selection				Departure: 17:00
17:30 - 18:00		Group Formation & Project Selection
18:00 - 18:30	Dinner & Icebreaking	Dinner	Dinner		Dinner
18:30 - 19:00	Dinner & Icebreaking	Dinner	Dinner		Dinner
19:00 - 19:30	Icebreaking Session	Datathon	Excursion		Social Evening
19:30 - 20:00				Conference Dinner
20:00 - 20:30				Conference Dinner

Rooms

By default, we meet in Lecture Hall Saarbrücken (LH-Sb).

acronym	name	building	lectures	practical session	datathon sessions	mini project (tutor)
LH-Sb	Lecture Hall Saarbrücken	new building	X	X	X	6 (Sina) & 7 (Julia)
LH-Kl	Lecture Hall Kaiserslautern	old building	_	X*	X	1 (Thierry)
S006	Cafeteria	old building	_	X**	X***	3 (Andon)
S104	S104	old building	_	_	X	4 (Bettina)
S003	S003	old building	_	_	X	2 (Christian F.)
News	News Room / Wappensaal	old building	_	_	X	8 (Alessandro)
Trier	Room Trier	old building	_	_	X	5 (Max)

* if necessary, t.b.a. in the session before (Mon-Thu),
** instead of LH-Kl, Friday (Fri) only,
*** if occupied, please move to LH-Kl (Mon-Thu)

About LLOD and the SD-LLOD datathon series

In natural language processing, linguistics, and neighboring fields, Linguistic Linked Open Data (LLOD) describes a method and an interdisciplinary community concerned with creating, sharing and (re-)using language resources in accordance with Linked Data principles. The Linguistic Linked Open Data cloud was conceived and is maintained by the Open Linguistics Working Group (OWLG) of Open Knowledge International, and has since been a focal point of activity for several W3C community groups, research projects and infrastructure efforts.

To a large extent, LLOD development has been driven forward by international workshops and accompanying hackathons, as organized, for example, in the context of workshops on Multilingual Linked Open Data for Enterprises in 2012 and 2014 in Leipzig, Germany. Since 2015, these have been organized in the form of biennial summer schools: The first Summer Datathon on Linguistic Linked Open Data (SD-LLOD’15) was held in June 2015 in Cercedilla, Madrid, Spain, as was the second Summer Datathon on Linguistic Linked Open Data (SD-LLOD’17) in July 2017. The 2019 edition is organized in conjunction with and held before the 2nd International Conference on Language, Data and Knowledge (LDK-2019, May 20th-22nd, Leipzig, Germany).

Notable outcomes of earlier datathon editions include the first installment of the LLOD cloud and the LLOD cloud diagram (as a result of MLODE-2012), a large number of converted resources, and numerous scientific publications and thesis projects that build on successful mini-projects, experiments or case studies conducted at or initiated during previous SD-LLOD datathons.

3rd Summer Datathon on Linguistic Linked Open Data (SD-LLOD-19)