A serendipitous corpus-linguistic experiment: Chapter 7. Frequential test of (S)OV as unmarked word order in Dutch and German clauses

Kempen, Gerard; Harbusch, Karin

doi:10.1075/z.210.07kem

In: Crossroads Semantics: Computation, experiment and grammar
Edited by Hilke Reckman, Lisa Lai-Shen Cheng, Maarten Hijzelendoorn and Rint Sybesma
[Not in series 210] 2017
pp. 107–123

Chapter 7
Frequential test of (S)OV as unmarked word order in Dutch and German clauses

A serendipitous corpus-linguistic experiment

Gerard Kempen | Max Planck Institute for Psycholinguistics, Nijmegen | Cognitive Psychology Unit, Leiden University

Karin Harbusch | Department of Computer Science, University of Koblenz-Landau

Published online: 12 April 2017

https://doi.org/10.1075/z.210.07kem

Abstract

In a paper entitled “Against markedness (and what to replace it with)”, Haspelmath argues “that the term ‘markedness’ is superfluous”, and that frequency asymmetries often explain structural (un)markedness asymmetries (Haspelmath 2006). We investigate whether this argument applies to Object and Verb orders in main (VO, marked) and subordinate (OV, unmarked) clauses of spoken and written German and Dutch, using English (without VO/OV alternation) as control. Frequency counts from six treebanks (three languages, two output modalities) do not support Haspelmath’s proposal. However, they reveal an unexpected phenomenon, most prominently in spoken Dutch and German: a small set of extremely high-frequency finite verbs with unspecific meanings populates main clauses much more densely than subordinate clauses. We suggest these verbs accelerate the start-up of grammatical encoding, thus facilitating sentence-initial output fluency.

Keywords: SOV, SVO, markedness, German, Dutch, verb frequency, sentence planning, fluency, corpus linguistics, psycholinguistics

Article outline

1. Introduction
2. Methodology
3. Three frequential tests
4. Discussion: Time and fluency pressures can boost VO:OV ratios
Notes
References
Appendix

References (20)

Beek, Leonoor van der, Gosse Bouma, Robert Malouf & Gertjan van Noord. 2002. The Alpino Dependency Treebank. In Tanja Gaustad (ed.), Computational Linguistics in the Netherlands 2001. Amsterdam: Rodopi.

Brants, Sabine, Stefanie Dipper, Peter Eisenberg, Silvia Hansen-Schirra, Esther König, Wolfgang Lezius, Christian Rohrer, George Smith & Hans Uszkoreit. 2004. TIGER: Linguistic Interpretation of a German Corpus. Research on Language and Computation 2. 597–620.

Charniak, Eugene, Don Blaheta, Niyu Ge, Keith Hall, John Hale & Mark Johnson. 2000. BLLIP 1987–89 WSJ Corpus Release 1 LDC2000T43. DVD. Philadelphia: Linguistic Data Consortium.

Drach, Erich. 1937. Grundgedanken der deutschen Satzlehre. Frankfurt am Main: Diesterweg. [Reprinted in 1963]

Dryer, Matthew. 1995. Frequency and pragmatically unmarked word order. In Mickey Noonan & Pamela Downing (eds.), Word order in discourse, 105–135. Amsterdam: John Benjamins.

Eerten, Laura van. 2007. Over het Corpus Gesproken Nederlands. Nederlandse Taalkunde 12. 194–215.

Fitz, Hartmut, Franklin Chang & Morten H. Christiansen. 2011. A connectionist account of the acquisition and processing of relative clauses. In Evan Kidd (ed.), The acquisition of relative clauses. Processing, typology and function, 39–60. Amsterdam: John Benjamins.

Godfrey, John J., Eduard C. Holliman & Jane McDaniel. 1992. SWITCHBOARD: Telephone speech corpus for research and development. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP-92), 517–520.

Haspelmath, Martin. 2006. Against markedness (and what to replace it with). Journal of Linguistics 42. 25–70.

Haider, Hubert. 2010. Wie wurde Deutsch OV? Zur diachronen Dynamik eines Strukturparameters der germanischen Sprachen. In Arne Ziegler (ed.), Historische Textgrammatik und Historische Syntax des Deutschen – Traditionen, Innovationen, Perspektiven, 11–32. Berlin: De Gruyter.

Höhle, Tilman N. 1986. Der Begriff ‘Mittelfeld’: Anmerkungen über die Theorie der topologischen Felder. In Walter Weiss, Herbert E. Wiegand & Marga Reis (eds.), Akten des VII. Internationalen Germanistenkongresses, 329–340. Tübingen: Niemeyer.

Hoekstra, Heleen, Michael Moortgat, Ineke Schuurman & Ton van der Wouden. 2001. Syntactic annotation for the spoken Dutch corpus project (CGN). Language and Computers 37(1). 73–87.

Kempen, Gerard & Karin Harbusch. 2016. Verb-second word order after German weil ‘because’: Psycholinguistic theory from corpus-linguistic data. Glossa: a journal of general linguistics 1(1). 1–32.

König, Esther & Wolfgang Lezius. 2003. The TIGER language: A Description Language for Syntax Graphs, Formal Definition. Stuttgart: University of Stuttgart.

Koster, Jan. 1975. Dutch as an SOV Language. Linguistic analysis 1. 111–136.

MacDonald, Maryellen C., Jessica L. Montag & Silvia P. Gennari. 2016. Are there really syntactic complexity effects in sentence production? A reply to Scontras et al. (2015). Cognitive Science 40. 513–518.

Noord, Gertjan van, Gosse Bouma, Frank van Eynde, Daniël de Kok, Jelmar van der Linde, Ineke Schuurman, Erik Tjong Kim Sang & Vincent Vandeghinste. 2013. Large scale syntactic annotation of written Dutch: Lassy. In Peter Spyns & Jan Odijk (eds.), Essential speech and language technology for Dutch, 147–164. Berlin: Springer.

Oostdijk, Nelleke, Martin Reynaert, Véronique Hoste & Ineke Schuurman. 2013. The construction of a 500-million-word reference corpus of contemporary written Dutch. In Peter Spyns & Jan Odijk (eds.), Essential speech and language technology for Dutch, 219–247. Berlin: Springer.

Stegmann, Rosemary, Heike Telljohann & Erhard W. Hinrichs. 2000. Stylebook for the German Treebank in Verbmobil. Saarbrücken: DFKI Report 239.

Wahlster, Wolfgang (ed.). 2000. Verbmobil: Foundations of speech-to-speech translation. Berlin: Springer.

Cited by (2)

Cited by two other publications

Pérez-Guerra, Javier

2022. Determinants of exaptation in Verb-Object predicates in the transition from Late Middle English to Early Modern English. In Broadening the Spectrum of Corpus Linguistics [Studies in Corpus Linguistics, 105], pp. 133 ff.

Kempen, Gerard & Karin Harbusch

2019. Mutual attraction between high-frequency verbs and clause types with finite verbs in early positions: corpus evidence from spoken English, Dutch, and German. Language, Cognition and Neuroscience 34:9, pp. 1140 ff.

This list is based on CrossRef data as of 24 November 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
