Article published in: Romance Parsed Corpora
Edited by Christina Tortora, Beatrice Santorini and Frances Blanchette
[Linguistic Variation 18:1] 2018
pp. 74–99
Special issue articles
Diachronic syntax based on constituency and dependency annotated corpora
Theoretical and methodological issues
Published online: 13 July 2018
https://doi.org/10.1075/lv.00005.ste
Abstract
This contribution presents two syntactically annotated corpora of Old French, Modéliser le changement: les voies du français (MCVF) and the Syntactic Reference Corpus of Medieval French (SRCMF). The focus is on how the underlying syntactic theory (constituency vs. dependency) influences the grammar model and how this choice is reflected in the syntactic annotations of the corpora. The comparison relates to the most relevant general properties of the corpora as well as to two phenomena, null subjects and cleft constructions. Null subjects highlight possible conflicts between syntactic annotation models and syntactic theory, and the information-structural properties of cleft constructions pose a particular problem for the interpretation and annotation of historical corpora. Both phenomena are major instances of diachronic variation in French. The study is relevant for corpus users working on diachronic syntax, as well as for corpus builders wishing to design a grammar model for annotation.
Article outline
- 1. Introduction: Syntactic annotation and variation
- 1.1 Variation in Old French
- 1.2 Parsed corpora of Old French
- 2. Grammar models and syntactic annotation
- 2.1 Principles of structure
- 2.2 Search strategies
- 2.3 Null subjects
- 3. Cleft constructions as a phenomenon of linguistic variation
- 3.1 Clefts and information structure
- 3.2 Types of cleft constructions
- 3.3 Clefts in diachrony
- 3.3.1 Latin
- 3.3.2 French
- 4. Cleft constructions in the Old French corpora
- 4.1 Corpora and queries
- 4.2 Explicit cleft annotation
- 4.3 Unmarked relative clauses
- 4.4 Comparing the corpora
- 5. Conclusion
- Notes
References
References (47)
Bosco, Cristina. 2004. A grammatical relation system for treebank annotation. Torino, Italy: Università degli Studi di Torino dissertation.
Bouchard, Jacynthe, Fernande Dupuis & Monique Dufresne. 2007. Un processus de focalisation en ancien français: le développement des clivées. In Milica Radišić (ed.), Actes du Congrès annuel de l’Association canadienne de linguistique (ACL) 2007, Association Canadienne de Linguistique. [URL].
Collins, Peter. 1991. Cleft and pseudo-cleft constructions in English. London & New York: Routledge.
Combettes, Bernard. 1999. Thématisation et topicalisation: leur rôle respectif dans l’évolution du français. In Claude Guimier (ed.), La thématisation dans les langues. Actes du colloque de Caen, 9–11 octobre 1997, 231–245. Paris: Peter Lang.
Dees, Anthonij. 1987. Atlas des formes linguistiques des textes littéraires de l’ancien français. Avec le concours de M. Dekker, O. Huber et K. van Reenen-Stein. Beihefte zur Zeitschrift für romanische Philologie 212. Tübingen: Niemeyer.
Dufter, Andreas. 2008. On explaining the rise of c’est-clefts in French. In Ulrich Detges & Richard Waltereit (eds.), The Paradox of Grammatical Change, 31–56. Amsterdam & Philadelphia: Benjamins.
Erteschik-Shir, Nomi. 2007. Information structure: the syntax-discourse interface. Oxford: Oxford University Press.
Guillot, Céline, Christiane Marchello-Nizia & Alexeij Lavrentiev. 2007. La Base de Français Médiéval (BFM): états et perspectives. In Pierre Kunstmann & Achim Stein (eds.), Le Nouveau Corpus d’Amsterdam. Actes de l’atelier de Lauterbad, 23–26 février 2006. Stuttgart: Steiner.
Gärtner, Markus, Gregor Thiele, Wolfgang Seeker, Anders Björkelund & Jonas Kuhn. 2013. ICARUS – An Extensible Graphical Search Tool for Dependency Treebanks. In Proceedings of the Annual Meeting of the Association for Computational Linguistics 2013, [URL].
Jochimsen, Paul. 1907. Beiträge zur Geschichte der deiktischen Hervorhebung eines einzelnen Satzteiles, bezw. eines Satzes mittelst c’est (…) que (qui). Kiel, Germany: Universität Kiel.
Krifka, Manfred. 2007. Basic Notions of Information Structure. In Caroline Féry, Gisbert Fanselow & Manfred Krifka (eds.), The Notions of Information Structure, 13–55. Potsdam: Universitätsverlag Potsdam.
Kroch, Anthony & Ann Taylor. 2000. The Penn-Helsinki Parsed Corpus of Middle English (PPCME2). Department of Linguistics, University of Pennsylvania. CD-ROM, second edition, release 4 ([URL]).
Kroch, Anthony, Beatrice Santorini & Ariel Diertani. 2010. The Penn-Helsinki Parsed Corpus of Modern British English (PPCMBE). Department of Linguistics, University of Pennsylvania. CD-ROM, first edition ([URL]).
Kroch, Anthony, Beatrice Santorini & Ariel Diertani. 2016. The Penn Parsed Corpus of Modern British English (PPCMBE2). Department of Linguistics, University of Pennsylvania. CD-ROM, second edition, release 1 ([URL]).
Kunstmann, Pierre & Achim Stein. 2007. Le Nouveau Corpus d’Amsterdam. In Pierre Kunstmann & Achim Stein (eds.), Le Nouveau Corpus d’Amsterdam. Actes de l’atelier de Lauterbad, 23–26 février 2006, 9–27. Stuttgart: Steiner.
Lambrecht, Knud. 1994. Information structure and sentence form: topic, focus, and the mental representations of discourse referents. Cambridge: Cambridge University Press.
Lezius, Wolfgang. 2002. Ein Suchwerkzeug für syntaktisch annotierte Textkorpora. Arbeitspapiere des Instituts für Maschinelle Sprachverarbeitung (AIMS), vol. 8, no. 4. Stuttgart: University of Stuttgart, Institut für Maschinelle Sprachverarbeitung (IMS).
Löfstedt, Bengt. 1966. Die Konstruktion ‘c’est lui qui l’a fait’ im Lateinischen. Indogermanische Forschungen 71. 253–277.
Marchello-Nizia, Christiane. 1999. Le français en diachronie: douze siècles d’évolution. Paris: Ophrys.
Marchello-Nizia, Christiane. 2009. Histoire interne du français: morphosyntaxe et syntaxe. In Gerhard Ernst, Martin-Dietrich Gleßgen, Christian Schmitt & Wolfgang Schweickard (eds.), Romanische Sprachgeschichte. Ein internationales Handbuch zur Geschichte der romanischen Sprachen und ihrer Erforschung, Teilband 3: Handbücher zur Sprach- und Kommunikationswissenschaft, 2926–2947. Berlin & New York: De Gruyter.
Marcus, Mitchell P., Beatrice Santorini & Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics 19. 313–330. Reprinted in Susan Armstrong, ed., 1994, Using large corpora. Cambridge, MA: MIT Press. 273–290.
Martineau, France. 2008. Un corpus pour l’analyse de la variation et du changement linguistique. Corpus 7. [URL].
Martineau, France (ed.). 2009. Le corpus MCVF. Modéliser le changement: les voies du français. Ottawa: Université d’Ottawa. [URL].
Martineau, France, Constanta Diaconescu & Paul Hirschbühler. 2007. Le corpus ‘Voies du français’: de l’élaboration à l’annotation. In Pierre Kunstmann & Achim Stein (eds.), Le Nouveau Corpus d’Amsterdam. Actes de l’atelier de Lauterbad, 23–26 février 2006, 121–142. Stuttgart: Steiner.
Mazziotta, Nicolas. 2010. Building the ‘Syntactic Reference Corpus of Medieval French’ using NotaBene RDF Annotation Tool. In Proceedings of the 4th Linguistic Annotation Workshop (LAW IV), [URL].
Mazziotta, Nicolas, Beatrice Bischof, Julie Glikman & Thomas Rainsford. 2012. ‘Ce’ sujet dans les ‘constructions impersonnelles’ du roman de Tristan de Béroul. L’information grammaticale 132. 48–52.
Polguère, Alain & Igor Mel’čuk (eds.). 2009. Dependency in linguistic description. Amsterdam & Philadelphia: Benjamins.
Prévost, Sophie & Achim Stein (eds.). 2013. Syntactic reference corpus of Medieval French (SRCMF). Lyon & Stuttgart: ENS de Lyon; Lattice & Paris; Universität Stuttgart. [URL].
Prince, Ellen F. 1981. Toward a taxonomy of given/new information. In Peter Cole (ed.), Radical pragmatics, 223–254. New York: Academic Press.
Prince, Ellen F. 1992. The ZPG letter: Subjects, definiteness, and information-status. In William Mann & Sandra Thompson (eds.), Discourse description: diverse linguistic analyses of a fund-raising text, 295–325. Amsterdam: Benjamins.
Reinhart, Tanya. 1981. Pragmatics and linguistics: An analysis of sentence topics. Philosophica 27. 53–94.
Rinke, Esther & Jürgen Meisel. 2009. Subject-inversion in Old French: Syntax and information structure. In Georg Kaiser & Eva-Maria Remberger (eds.), Proceedings of the Workshop ‘Null subjects, expletives, and locatives in Romance’, Arbeitspapiere Fachbereich Sprachwissenschaft 123, 93–130. Konstanz: Fachbereich Sprachwissenschaft. [URL]: 352-opus-78604.
Roberts, Craige. 1998. Focus, the flow of information, and universal grammar. In Peter W. Culicover & Louise McNally (eds.), The limits of syntax, 109–160. San Diego: Academic Press.
Rouquier, Magali. 2007. Les constructions clivées en ancien français et en moyen français. Romania 125(1–2). 167–212.
Stein, Achim & Sophie Prévost. 2013. Syntactic annotation of medieval texts: the Syntactic Reference Corpus of Medieval French (SRCMF). In Paul Bennett, Martin Durrell, Silke Scheible & Richard Whitt (eds.), New Methods in Historical Corpora, 275–282. Tübingen: Narr.
Stein, Achim et al. (eds.). 2006. Nouveau Corpus d’Amsterdam. Corpus informatique de textes littéraires d’ancien français (ca 1150–1350), établi par Anthonij Dees (Amsterdam 1987), remanié par Achim Stein, Pierre Kunstmann et Martin-D. Gleßgen. Stuttgart: Institut für Linguistik/Romanistik. [URL].
Waters, Edwin G. R. (ed.). 1974. The Anglo-Norman Voyage of St. Brendan by Benedeit. A poem of the early twelfth century. Genève: Slatkine Reprints.
Wehr, Barbara. 2005. Focusing strategies in Old French and Old Irish. In Janne Skaffari, Matti Peikola, Ruth Carroll, Risto Hiltunen & Brita Wårvik (eds.), Opening windows on texts and discourses of the past, 354–379. Amsterdam: John Benjamins.
