Article published in: International Journal of Corpus Linguistics
Vol. 27:3 (2022), pp. 259–290
Lectal contamination
Evidence from corpora and from agent-based simulation
Published online: 13 June 2022
https://doi.org/10.1075/ijcl.20040.pij
Abstract
This paper presents evidence from both corpora and agent-based simulation for the effect of lectal contamination.
By doing so, it shows how agent-based simulation can be used as a complementary technique to corpus research in the study of
language variation. Lectal contamination is an effect whereby words that are typical of a language variety appear more often
in a morphosyntactic variant typical of that same variety, even in the language use of speakers of a different variety. This study looks at
the Dutch partitive genitive construction, which exhibits variation between a “Netherlandic” variant with -s
ending and a “Belgian” variant without -s ending. It is shown that the probability of the Belgian variant without
-s increases among more “Belgian” words, in the language use of both Belgians and people from the Netherlands. Meanwhile, an
agent-based simulation reveals the crucial theoretical preconditions that lead to this effect.
Article outline
- 1. Introduction
- 2. Preconditions and predictions of lectal contamination
- 3. Corpus study
- 3.1 Data
- 3.2 Analyses
- 4. Agent-based simulation
- 4.1 Design and evaluation
- 4.2 Implementation
- 4.3 Results
- 5. Discussion and conclusions
- Acknowledgments
- Notes
- References
