In: Diachronic Corpora, Genre, and Language Change
Edited by Richard J. Whitt
[Studies in Corpus Linguistics 85] 2018
► pp. 219–240
Variation of sentence length across time and genre
Influence on syntactic usage in English
Published online: 8 November 2018
https://doi.org/10.1075/scl.85.10rud
The goal of this paper is threefold: (i) to present some practical aspects of using the full-text version of the
Corpus of Historical American English (COHA), the largest diachronic multi-genre corpus of the
English language, in the investigation of a linguistic trend of change; (ii) to test a widely held assumption that
sentence length in written English has been steadily decreasing over the past few centuries; (iii) to point to a
possible link between changes in sentence length and changes in English syntactic usage. The empirical proof of
concept for (iii) is provided by the decline in the frequency of the non-finite purpose subordinator in order
to. Sentence length, genre and the likelihood of occurrence of in order to are shown to
be interrelated.
Article outline
- 1. Introduction
- 2. Sentence length in written English: The diachronic evolution across genres
- 2.1 Just a matter of punctuation conventions?
- 3. A comprehensive analysis of sentence length in the time period of 1800–2000
- 3.1 Design of the analysis and methodology
- 3.1.1 Full-text COHA
- 3.1.2 Genres in COHA
- 3.1.3 Sentence tokenisation: Methodology
- 3.2 Results
- 3.3 Discussion
- 4. Sentence length and syntactic usage
- 5. Conclusions
- Notes
- Corpora
- References
References (29)
Davies, Mark. 2008–. The Corpus of Contemporary American English (COCA): 520 million words, 1990–present. <[URL]>
Davies, Mark. 2010–. The Corpus of Historical American English (COHA): 400 million words, 1810–2009. <[URL]>
Biber, Douglas & Gray, Bethany. 2010. Challenging stereotypes about academic writing: Complexity, elaboration, explicitness. Journal of English for Academic Purposes 9: 2–20.
Culpeper, Jonathan & Kytö, Merja. 2010. Early Modern English Dialogues: Spoken Interaction as Writing. Cambridge: CUP.
Davis, Edith A. 1937. Mean sentence length compared with long and short sentences as a reliable measure of language development. Child Development 8(1): 69–79.
Dorgeloh, Heidrun. 2005. Patterns of agentivity and narrativity in early science discourse. In Opening Windows in Discourse and Texts from the Past [Pragmatics & Beyond New Series 134], Janne Skaffari, Matti Peikola, Ruth Carroll, Risto Hiltunen & Brita Warvik (eds), 83–94. Amsterdam: John Benjamins.
Fries, Udo. 2010. Sentence length, sentence complexity and the noun phrase in 18th-century news publications. In Language Change and Variation from Old English to Late Modern English [Linguistic Insights 114], Merja Kytö, John Scahill & Harumi Tanabe (eds), 21–33. Bern: Peter Lang.
Gross, Alan G., Harmon, Joseph E. & Reidy, Michael S. 2002. Communicating Science: The Scientific Article from the 17th Century to the Present. Oxford: OUP.
Hames, Tim & Rae, Nicol C. 1996. Governing America: History, Culture, Institutions, Organisation, Policy. Manchester: Manchester University Press.
Hilpert, Martin. 2013. Constructional Change in English: Developments in Allomorphy, Word Formation, and Syntax. Cambridge: CUP.
Hundt, Marianne, Denison, David & Schneider, Gerold. 2012. Relative complexity in scientific discourse. English Language and Linguistics 16(2): 209–240.
Leech, Geoffrey, Hundt, Marianne, Mair, Christian & Smith, Nicholas. 2009. Change in Contemporary English. A Grammatical Study. Cambridge: CUP.
Lewis, Edwin H. 1894. The History of the English Paragraph. Chicago IL: The University of Chicago Press.
Mair, Christian. 1998. Corpora and the study of major varieties in English: Issues and results. In The Major Varieties of English: Papers from MAVEN 97, Hans Lindquist, Staffan Klintborg, Magnus Levin & Maria Estling (eds), 139–157. Växjö: Acta Wexionensia.
McGregor, Gordon P. 2002. English for Life? Teaching English as a Second Language in Sub-Saharan Africa with Special Reference to Uganda. Kampala: Fountain Publishers.
Rudnicka, Karolina. In preparation. The Statistics of Obsolescence: Purpose Subordinators in Late Modern English. PhD dissertation, Albert-Ludwigs-Universität Freiburg.
Säily, Tanja, Vartiainen, Turo & Siirtola, Harri. 2017. Exploring part-of-speech frequencies in a sociohistorical corpus of English. In Exploring Future Paths for Historical Sociolinguistics [Advances in Historical Sociolinguistics 7], Tanja Säily, Arja Nurmi, Minna Palander-Collin & Anita Auer (eds). Amsterdam: John Benjamins.
Schmidtke-Bode, Karsten. 2009. A Typology of Purpose Clauses [Typological Studies in Language 88]. Amsterdam: John Benjamins.
Schneider, Kristina. 2002. The Development of Popular Journalism in England from 1700 to the Present, Corpus Compilation and Selective Stylistic Analysis. PhD dissertation, Universität Rostock.
Štajner, Sanja & Mitkov, Ruslan. 2012. Diachronic changes in text complexity in 20th century English language: An NLP approach. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet U. Dogan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk & Stelios Piperidis (eds), 1577–1584.
Cited by (17)
Krielke, Marie-Pauline
2024. Cross-linguistic Dependency Length Minimization in scientific language. Languages in Contrast 24:1 ► pp. 133 ff.
Pilgrim, Charlie, Weisi Guo & Thomas T. Hills
Zhang, Ran, Jihed Ouni & Steffen Eger
Abdurrahman, Israa Burhanuddin, Aya Qasim Hasan & Ali Hussein Hazem
Mücke, Justin, Daria Waldow, Luise Metzger, Philipp Schauz, Marcel Hoffman, Nicolas Lell & Ansgar Scherp
Schneider, Gerold & Maud Reveilhac
2023. Colloquialisation, compression and democratisation in British parliamentary debates. In Exploring Language and Society with Big Data [Studies in Corpus Linguistics, 111], ► pp. 336 ff.
Alemany-Puig, Lluís & Ramon Ferrer-i-Cancho
Zhu, Haoran, Xueying Liu & Nana Pang
Kranich, Svenja & Tine Breban
Rudnicka, Karolina
Rudnicka, Karolina
2021. So-adj-a construction as a case of obsolescence in progress. In Lost in Change [Studies in Language Companion Series, 218], ► pp. 51 ff.
Rudnicka, Karolina
Tsizhmovska, Natalia L. & Leonid M. Martyushev
Lorenz, David
2020. Converging variations and the emergence of horizontal links. In Nodes and networks in Diachronic Construction Grammar [Constructional Approaches to Language, 27], ► pp. 243 ff.
[no author supplied]
[no author supplied]
This list is based on CrossRef data as of 1 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
