In: Urban Matters: Current approaches in variationist sociolinguistics
Edited by Arne Ziegler, Stefanie Edler and Georg Oberdorfer
[Studies in Language Variation 27] 2021
pp. 253–278
Testing models of diffusion of morphosyntactic innovations in Twitter data
Available under the Creative Commons Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND) 4.0 license.
For any use beyond this license, please contact the publisher at rights@benjamins.nl.
Published online: 16 December 2021
https://doi.org/10.1075/silv.27.11bla
Abstract
Established models of the spatial diffusion of linguistic innovations vary in their relationship to population density. Differences in prediction between gravity models (Trudgill 1974), in which the probability of diffusion is sensitive to settlement size, and traditional wave models can be challenging to test because large-scale, fine-grained geographical sampling is difficult. This paper tests the suitability of data derived from Twitter for establishing diffusion patterns. Using two case studies from British English – variation in the realisation of ditransitives, and preposition drop with go – we propose that the correlation between (local) population density and linguistic similarity to geographical neighbours can be used as a measure of hierarchical patterning for an individual innovation.
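The measure proposed in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the use of k nearest neighbours, the definition of similarity as one minus the mean absolute difference in variant rate, and the choice of Pearson's r are all assumptions made for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def neighbour_similarity(rates, coords, k=2):
    """Similarity of each location to its k nearest neighbours.

    ASSUMPTION: similarity is 1 minus the mean absolute difference in the
    rate of the innovative variant; the chapter may define it differently.
    """
    sims = []
    for i, (xi, yi) in enumerate(coords):
        # Indices of the k closest other locations (Euclidean distance).
        nearest = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: math.hypot(coords[j][0] - xi, coords[j][1] - yi),
        )[:k]
        mean_diff = sum(abs(rates[i] - rates[j]) for j in nearest) / len(nearest)
        sims.append(1.0 - mean_diff)
    return sims


def density_similarity_correlation(density, rates, coords, k=2):
    """Correlate local population density with similarity to neighbours."""
    return pearson(density, neighbour_similarity(rates, coords, k))


# Hypothetical toy data: five locations on a line; only the densest one
# (index 1) uses the innovative variant.
coords = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
density = [1, 10, 1, 1, 1]
rates = [0.0, 1.0, 0.0, 0.0, 0.0]
r = density_similarity_correlation(density, rates, coords)
```

In this toy case the dense location differs from all of its neighbours, so the correlation is negative. How the sign and size of such a correlation map onto wave-like versus hierarchical diffusion is the question the chapter itself addresses.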
Article outline
- 1. Introduction
- 2. Methodology and corpus construction
- 2.1 Corpus structure
- 2.1.1 Localisation
- 3. Mapping the distribution of morphosyntactic variants
- 3.1 Dative alternation revisited
- 3.2 Preposition drop
- 4. Approaches to the quantification of diffusion
- 4.1 Measurement
- 4.1.1 Simulated data
- 4.2 Evaluating real data
- 5. Conclusions
References (49)
Bailey, Guy et al. 1993. Some patterns of linguistic diffusion. Language Variation and Change 5(3). 359–390.
Bailey, Laura R. 2018. Some characteristics of Southeast English preposition dropping. Iberia: An International Journal of Theoretical Linguistics 10. 48–70.
Bamman, David, Jacob Eisenstein and Tyler Schnoebelen. 2014. Gender identity and lexical variation in social media. Journal of Sociolinguistics 18(2). 135–160.
Biggs, Alison. 2014. Passive variation in the dialects of Northwest British English. Paper presented at the 3rd Conference of the International Society for the Linguistics of English (ISLE), University of Zürich, 24–27 August. [URL]. (4 February 2020)
Biggs, Alison. 2015. A new case for A-movement in Northwest British English. In Ulrike Steindl et al. (eds.), Proceedings of the 32nd West Coast Conference on Formal Linguistics (WCCFL 32), 218–227. Somerville, MA: Cascadilla Proceedings Project.
Biggs, Alison. 2016. Locating variation in the dative alternation. Linguistic Variation 16(2). 151–182.
Bresnan, Joan W. and Marilyn Ford. 2010. Predicting syntax. Processing dative constructions in American and Australian varieties of English. Language 86(1). 168–213.
Burridge, James. 2018. Unifying models of dialect spread and extinction using surface tension dynamics. Royal Society Open Science 5(1).
Doyle, Gabriel. 2014. Mapping dialectal variation by querying social media. In Shuly Wintner, Sharon Goldwater and Stefan Riezler (eds.), Proceedings of the 14th conference of the European chapter of the Association for Computational Linguistics, 98–106. Gothenburg: Association for Computational Linguistics. [URL].
Eisenstein, Jacob. 2018. Identifying regional dialects in on-line social media. In Charles Boberg, John Nerbonne and Dominic Watt (eds.), The handbook of dialectology, 368–383. Hoboken, NJ: Wiley–Blackwell.
Gast, Volker. 2007. I gave it him – on the motivation of the ‘alternative double object construction’ in varieties of British English. Functions of Language 14(1). 31–56.
Gerwin, Johanna. 2013. Give it me!: Pronominal ditransitives in English dialects. English Language and Linguistics 17(3). 445–463.
Gonçalves, Bruno and David Sánchez. 2014. Crowdsourcing dialect characterization through Twitter. PLoS ONE 9(11).
Grieve, Jack et al. 2019. Mapping lexical dialect variation in British English using Twitter. Frontiers in Artificial Intelligence 2. Article 11.
Grieve, Jack, Andrea Nini and Diansheng Guo. 2017. Analyzing lexical emergence in Modern American English online. English Language and Linguistics 21(1). 99–127.
Haddican, William. 2010. Theme-goal ditransitives and theme passivisation in British English dialects. Lingua 120(10). 2424–2443.
Haddican, William and Daniel E. Johnson. 2012. Effects on the particle verb alternation across English dialects. University of Pennsylvania Working Papers in Linguistics 18(2). 31–40.
Hägerstrand, Torsten. 1952. The propagation of innovation in waves (Lund Studies in Geography, Ser. B, Human Geography, 4). Lund: Royal University of Lund, Department of Geography.
Hall, David. 2019. P D drop and pseudo-incorporation in London English. In Maggie Baird and Jonathan Pesetsky (eds.), NELS 49: Proceedings of the Forty-Ninth Annual Meeting of the North East Linguistic Society: Volume 2, 85–89. Amherst, MA: GLSA.
Hecht, Brent and Monica Stephens. 2014. A tale of cities: Urban biases in volunteered geographic information. In Eytan Adar and Paul Resnick (eds.), Proceedings of the Eighth International AAAI Conference on Weblogs and Social Media, 197–205. Palo Alto, CA: Association for the Advancement of Artificial Intelligence.
Huang, Yuan et al. 2016. Understanding U.S. regional linguistic variation with Twitter data analysis. Computers, Environment and Urban Systems 59. 244–255.
Jones, Taylor. 2015. Toward a description of African American Vernacular English dialect regions using “Black Twitter”. American Speech 90(4). 403–440.
Labov, William. 2001. Principles of linguistic change, vol. 2: Social factors (Language in Society 29). Malden, MA: Wiley–Blackwell.
Malik, Momin et al. 2015. Population bias in geotagged tweets. In Derek Ruths and Jürgen Pfeffer (eds.), Standards and Practices in Large-Scale Social Media Research: Papers from the 2015 ICWSM Workshop. [URL]. (10 January 2020)
Myler, Neil. 2013. On coming the pub in the North West of England: Accusative unaccusatives, dependent case, and preposition incorporation. Journal of Comparative Germanic Linguistics 16(2/3). 189–207.
Nguyen, Dong et al. 2014. Why gender and age prediction from tweets is hard. Lessons from a crowdsourcing experiment. In Junichi Tsujii and Jan Hajić (eds.), Proceedings of COLING 2014, the 25th international conference on computational linguistics: Technical papers, 1950–1961. Dublin: Dublin City University and Association for Computational Linguistics.
Office for National Statistics, Geography Division. 2016. Index of place names in Great Britain (July 2016). [URL]
Olsson, Gunnar. 1965. Distance and human interaction. A review and bibliography (Bibliography Series 2). Philadelphia, PA: Regional Science Research Institute.
Ordnance Survey Ireland. 2016. Townlands – OSi national placenames gazetteer. [URL]. (10 January 2020)
Orton, Harold, Stewart Sanderson and John D. A. Widdowson. 1978. The linguistic atlas of England. London: Croom Helm.
Pavalanathan, Umashanthi and Jacob Eisenstein. 2015. Confounds and consequences in geotagged Twitter data. In Lluís Màrquez, Chris Callison-Burch and Jian Su (eds.), Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2138–2148. Stroudsburg, PA: Association for Computational Linguistics. [URL]
Reis, Stefan et al. 2017. UK gridded population 2011 based on Census 2011 and Land Cover Map 2015. NERC Environmental Information Data Centre. (Dataset).
Russ, Brice. 2012. Examining large-scale regional variation through online geotagged corpora. Paper presented at the Annual Meeting of the American Dialect Society, Portland. [URL]
Schmidt, Johannes. 1872. Die verwantschaftsverhältnisse der indogermanischen sprachen. Weimar: Böhlau.
Shoemark, Philippa et al. 2017. Aye or naw, whit dae ye hink? Scottish independence and linguistic identity on social media. In Mirella Lapata, Phil Blunsom and Alexander Koller (eds.), Proceedings of the 15th conference of the European chapter of the association for computational linguistics, vol. 1, long papers. Stroudsburg, PA: Association for Computational Linguistics.
Siewierska, Anna and Willem B. Hollmann. 2007. Ditransitive clauses in English with special reference to Lancashire dialect. In Mike Hannay and Gerard J. Steen (eds.), Structural-functional studies in English grammar. In honour of Lachlan Mackenzie, vol. 83: Studies in Language Companion Series, 83–102. Amsterdam: John Benjamins.
Stevenson, Jonathan. 2016. Dialect in digitally mediated written interaction: A survey of the geohistorical distribution of the ditransitive in British English using Twitter. Master’s Thesis. York: University of York.
Strelluf, Christopher. 2019. Anymore, it’s on Twitter. Positive anymore, American regional dialects, and polarity licensing in tweets. American Speech 94(3). 313–351.
Szmrecsanyi, Benedikt. 2013. Grammatical variation in British English dialects. A study in corpus-based dialectometry. Cambridge: Cambridge University Press.
Trudgill, Peter. 1974. Linguistic change and diffusion: Description and explanation in sociolinguistic dialect geography. Language in Society 3(2). 215–246.
Upton, Clive and John D. A. Widdowson. 1996. An atlas of English dialects: Region and dialect. Oxford: Oxford University Press.
Wikle, Thomas and Guy Bailey. 1997. The spatial diffusion of linguistic features in Oklahoma. Proceedings of the Oklahoma Academy of Science 77. 1–15.
Willis, David. 2020. Using social-media data to investigate morphosyntactic variation and dialect syntax in a lesser-used language: Two case studies from Welsh. Glossa: a journal of general linguistics 5(1). 103.
Wolk, Christoph et al. 2013. Dative and genitive variability in Late Modern English. Exploring cross-constructional variation and change. Diachronica 30(3). 382–419.
