In: The Pragmatics of Discourse Coherence: Theories and applications
Edited by Helmut Gruber and Gisela Redeker
[Pragmatics & Beyond New Series 254] 2014
pp. 121–141
Resolving connective ambiguity
a prerequisite for discourse parsing
Published online: 26 November 2014
https://doi.org/10.1075/pbns.254.05ste
Automatic discourse parsing refers to identifying the coherence relations in a text
and deriving a structural description for it. Such parsers can derive much
information from the presence of surface cues, especially connectives. These
lexical signals, however, are ambiguous: Many have additional, non-connective
readings; also, many connectives can signal more than one coherence relation.
In this paper, we discuss the first problem, focusing on English and German:
How many connectives are ambiguous, and how frequent are these in the two
languages? Then we examine computational approaches for resolving such
ambiguities. For English, we provide an overview of relevant work by other
researchers, while for German we largely present our own studies on the utility
of part-of-speech tagging for connective disambiguation.
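
To make the idea of part-of-speech-based disambiguation concrete, the following is a minimal sketch and not the method reported in the chapter. It assumes tokens have already been tagged with the STTS tagset for German; the two-entry lexicon `CONNECTIVE_TAGS`, the function name `find_connectives`, and the hand-tagged example sentence are hypothetical and serve only as illustration.

```python
# Hypothetical mini-lexicon (an assumption for illustration): for each
# candidate word, the STTS tags under which it has a connective reading.
CONNECTIVE_TAGS = {
    "da": {"KOUS"},    # KOUS: subordinating conjunction ("since"); as ADV it means "there"
    "denn": {"KON"},   # KON: coordinating conjunction ("because"); as ADV it is a particle
}


def find_connectives(tagged_tokens):
    """Return (index, token) pairs whose POS tag licenses a connective reading."""
    hits = []
    for i, (token, tag) in enumerate(tagged_tokens):
        allowed = CONNECTIVE_TAGS.get(token.lower())
        if allowed is not None and tag in allowed:
            hits.append((i, token))
    return hits


# Hand-tagged example: "Da steht er, da er wartet."
# ("There he stands, since he is waiting.")
example = [
    ("Da", "ADV"), ("steht", "VVFIN"), ("er", "PPER"), (",", "$,"),
    ("da", "KOUS"), ("er", "PPER"), ("wartet", "VVFIN"), (".", "$."),
]

print(find_connectives(example))
```

Only the second "da" is returned, because the first carries the adverb tag. A filter of this kind can be no more reliable than the tagger that feeds it, and it cannot separate readings that share the same tag.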
