Who Will Be Interested

Part I: Theory of Discourse Coherence

► Who will be interested: Anyone who has knowledge of linguistic structures and is interested in learning how single sentences interact within a piece of text.

► Prerequisites:
– Basic knowledge of and interest in language structures
– Basic knowledge of automatic text processing
– A working laptop

Part II: Shallow Discourse Parsing

► Who will be interested: Anyone who loves processing linguistic structures programmatically and who is interested in learning to process text beyond single sentences.

► Prerequisites:
– All of the prerequisites listed for Part I
– Programming in Python

The data and software for Part II will be sent to registered participants via email.


Utterances in natural language do not occur in isolation, but rather are part of larger texts or conversations. A central notion of such larger discourses is coherence, the relatedness of utterances to each other. In the first part of this workshop, I will introduce a shallow model of discourse coherence based on discourse relations, such as CAUSE or PRECEDENCE. These discourse relations are typically marked by words or phrases called connectives. In practical exercises, we will discuss the notion of connectives and explore corpora annotated for discourse coherence.
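
As a toy illustration of how connectives can signal discourse relations, the sketch below looks up words from a small lexicon and reports the relation each one might mark. The lexicon, the function name find_connectives, and the example sentence are hypothetical and are not part of the workshop materials. Only the relation labels CAUSE and PRECEDENCE come from the description above.

```python
import re

# Hypothetical toy lexicon mapping connectives to relation labels.
# Illustrative only; annotated corpora use far larger inventories.
CONNECTIVES = {
    "because": "CAUSE",
    "so": "CAUSE",
    "then": "PRECEDENCE",
    "before": "PRECEDENCE",
}


def find_connectives(text):
    """Return (connective, relation, character offset) for each lexicon match."""
    pattern = r"\b(" + "|".join(map(re.escape, CONNECTIVES)) + r")\b"
    return [
        (m.group(1).lower(), CONNECTIVES[m.group(1).lower()], m.start())
        for m in re.finditer(pattern, text, flags=re.IGNORECASE)
    ]


text = "The match was cancelled because it rained. Then everyone went home."
for connective, relation, offset in find_connectives(text):
    print(connective, relation, offset)
```

A plain lexicon lookup like this overgenerates: words such as "so" or "then" do not always function as connectives, so a real system has to decide from context whether a candidate actually marks a discourse relation.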

The second part of the workshop addresses the computational task of automatically identifying the shallow discourse structure of texts (shallow discourse parsing). I will present several existing approaches to this task. We will try out available discourse parsers on text and compare their advantages and disadvantages. Finally, we will implement one module of a discourse parser together.


Tatjana Scheffler teaches computational linguistics in the Department of Linguistics at the University of Potsdam, Germany. She received her Ph.D. in Linguistics from the University of Pennsylvania and has worked as a researcher in intelligent multimodal interfaces at the German Research Center for Artificial Intelligence (DFKI). Her research interests are discourse structure, the analysis of social media conversations, and computational social science. Her methods include formal theoretical linguistics as well as corpus and computational linguistics. She is Co-PI of the research project “Discourse Strategies across Social Media,” investigating discourse-level linguistic variability within the Collaborative Research Cluster on “Limits of Variability in Language” at the University of Potsdam.