In: Crossroads Semantics: Computation, experiment and grammar
Edited by Hilke Reckman, Lisa Lai-Shen Cheng, Maarten Hijzelendoorn and Rint Sybesma
[Not in series 210] 2017
pp. 107–123
Chapter 7. Frequential test of (S)OV as unmarked word order in Dutch and German clauses
A serendipitous corpus-linguistic experiment
Gerard Kempen | Max Planck Institute for Psycholinguistics, Nijmegen | Cognitive Psychology Unit, Leiden University
Published online: 12 April 2017
https://doi.org/10.1075/z.210.07kem
Abstract
In a paper entitled “Against markedness (and what to replace it with)”, Haspelmath argues “that the term ‘markedness’ is superfluous”, and that frequency asymmetries often explain structural (un)markedness asymmetries (Haspelmath 2006). We investigate whether this argument applies to Object and Verb orders in main (VO, marked) and subordinate (OV, unmarked) clauses of spoken and written German and Dutch, using English (without VO/OV alternation) as control. Frequency counts from six treebanks (three languages, two output modalities) do not support Haspelmath’s proposal. However, they reveal an unexpected phenomenon, most prominently in spoken Dutch and German: a small set of extremely high-frequent finite verbs with unspecific meanings populates main clauses much more densely than subordinate clauses. We suggest these verbs accelerate the start-up of grammatical encoding, thus facilitating sentence-initial output fluency.
Keywords: SOV, SVO, markedness, German, Dutch, verb frequency, sentence planning, fluency, corpus linguistics, psycholinguistics
Article outline
- 1. Introduction
- 2. Methodology
- 3. Three frequential tests
- 4. Discussion: Time and fluency pressures can boost VO:OV ratios
- Notes
- References
- Appendix
References (20)
Beek, Leonoor van der, Gosse Bouma, Robert Malouf & Gertjan van Noord. 2002. The Alpino Dependency Treebank. In Tanja Gaustad (ed.), Computational Linguistics in the Netherlands 2001. Amsterdam: Rodopi.
Brants, Sabine, Stefanie Dipper, Peter Eisenberg, Silvia Hansen-Schirra, Esther König, Wolfgang Lezius, Christian Rohrer, George Smith & Hans Uszkoreit. 2004. TIGER: Linguistic Interpretation of a German Corpus. Research on Language and Computation 2. 597–620.
Charniak, Eugene, Don Blaheta, Niyu Ge, Keith Hall, John Hale & Mark Johnson. 2000. BLLIP 1987–89 WSJ Corpus Release 1 LDC2000T43. DVD. Philadelphia: Linguistic Data Consortium.
Drach, Erich. 1937. Grundgedanken der deutschen Satzlehre. Frankfurt am Main: Diesterweg. [Reprinted in 1963]
Dryer, Matthew. 1995. Frequency and pragmatically unmarked word order. In Mickey Noonan & Pamela Downing (eds.), Word order in discourse, 105–135. Amsterdam: John Benjamins.
Fitz, Hartmut, Franklin Chang & Morten H. Christiansen. 2011. A connectionist account of the acquisition and processing of relative clauses. In Evan Kidd (ed.), The acquisition of relative clauses: Processing, typology and function, 39–60. Amsterdam: John Benjamins.
Godfrey, John J., Eduard C. Holliman & Jane McDaniel. 1992. SWITCHBOARD: Telephone speech corpus for research and development. In Proceedings of the International Conference on Audio, Speech and Signal Processing (ICASSP-92), 517–520.
Haider, Hubert. 2010. Wie wurde Deutsch OV? Zur diachronen Dynamik eines Strukturparameters der germanischen Sprachen. In Arne Ziegler (ed.), Historische Textgrammatik und Historische Syntax des Deutschen – Traditionen, Innovationen, Perspektiven, 11–32. Berlin: De Gruyter.
Haspelmath, Martin. 2006. Against markedness (and what to replace it with). Journal of Linguistics 42. 25–70.
Höhle, Tilman N. 1986. Der Begriff ‘Mittelfeld’: Anmerkungen über die Theorie der topologischen Felder. In Walter Weiss, Herbert E. Wiegand & Marga Reis (eds.), Akten des VII. Internationalen Germanistenkongresses, 329–340. Tübingen: Niemeyer.
Hoekstra, Heleen, Michael Moortgat, Ineke Schuurman & Ton van der Wouden. 2001. Syntactic annotation for the spoken Dutch corpus project (CGN). Language and Computers 37(1). 73–87.
Kempen, Gerard & Karin Harbusch. 2016. Verb-second word order after German weil ‘because’: Psycholinguistic theory from corpus-linguistic data. Glossa: a journal of general linguistics 1(1). 1–32.
König, Esther & Wolfgang Lezius. 2003. The TIGER language: A Description Language for Syntax Graphs, Formal Definition. Stuttgart: University of Stuttgart.
MacDonald, Maryellen C., Jessica L. Montag & Silvia P. Gennari. 2016. Are there really syntactic complexity effects in sentence production? A reply to Scontras et al. (2015). Cognitive Science 40. 513–518.
Noord, Gertjan van, Gosse Bouma, Frank van Eynde, Daniël de Kok, Jelmar van der Linde, Ineke Schuurman, Erik Tjong Kim Sang & Vincent Vandeghinste. 2013. Large scale syntactic annotation of written Dutch: Lassy. In Peter Spyns & Jan Odijk (eds.), Essential speech and language technology for Dutch, 147–164. Berlin: Springer.
Oostdijk, Nelleke, Martin Reynaert, Véronique Hoste & Ineke Schuurman. 2013. The construction of a 500-million-word reference corpus of contemporary written Dutch. In Peter Spyns & Jan Odijk (eds.), Essential speech and language technology for Dutch, 219–247. Berlin: Springer.
