Article published in: Current trends in analyzing syntactic variation
Edited by Ludovic De Cuypere, Clara Vanderschueren and Gert de Sutter
[Belgian Journal of Linguistics 31] 2017
pp. 137–164
Agreement mismatches in Dutch relatives
Published online: 23 April 2018
https://doi.org/10.1075/bjl.00006.bou
Abstract
This paper investigates agreement mismatches in Dutch relatives. While the norm is that singular neuter nouns occur with the relative
pronoun dat ‘that’, it is by now quite common to find neuter nouns combining with the relative pronoun
die. A large Twitter corpus is used to study which linguistic variables make die ‘that’ in this context
more likely. Lack of agreement between neuter noun and relative pronoun is very frequent in this corpus (37.5% of the cases, 46.8% if the
preceding determiner is indefinite). Non-agreement is most common for nouns that are high in the animacy ranking, but it also occurs with
other semantic classes, and there is quite a bit of lexical variation. Young, female users have a stronger tendency to use non-agreeing
relative pronouns. Contrary to what previous work suggests, we do not find that users with a Moroccan or Turkish background have a stronger
tendency towards non-agreement. A comparison of tweets with agreeing and non-agreeing pronouns and a comparison of the Twitter corpus with
web data both suggest that non-agreement is characteristic of informal language use.
Article outline
- 1. Introduction
- 2. Previous work
- 2.1 Dutch nominal agreement
- 2.2 Language use on social media
- 3. Corpus construction
- 4. Linguistic variation
- 5. Demographic variation
- 6. Formality
- 7. Conclusion
- Notes
- References
