Utterances in natural language do not occur in isolation; they are part of larger texts or conversations. A central notion for such discourses is coherence, the relatedness of utterances to each other. In the first part of this workshop, I will introduce a shallow model of discourse coherence based on discourse relations such as CAUSE or PRECEDENCE. These relations are typically marked by words or phrases called connectives, for example “because” or “afterwards.” In practical exercises, we will discuss the notion of connectives and explore corpora annotated for discourse coherence.

The second part of the workshop addresses the computational task of automatically identifying the shallow discourse structure of texts, known as shallow discourse parsing. I will present existing approaches to this task. We will then try out available discourse parsers on text and compare their advantages and disadvantages. Finally, we will jointly implement one module of a discourse parser.


Tatjana Scheffler teaches computational linguistics in the Department of Linguistics at the University of Potsdam, Germany. She received her Ph.D. in Linguistics from the University of Pennsylvania and has worked as a researcher in intelligent multimodal interfaces at the German Research Center for Artificial Intelligence (DFKI). Her research interests are discourse structure, the analysis of social media conversations, and computational social science. Her methods include formal theoretical linguistics as well as corpus and computational linguistics. She is Co-PI of the research project “Discourse Strategies across Social Media,” investigating discourse-level linguistic variability within the Collaborative Research Cluster on “Limits of Variability in Language” at the University of Potsdam.