In: Corpus Approaches to Social Media
Edited by Sofia Rüdiger and Daria Dayter
[Studies in Corpus Linguistics 98] 2020
pp. 111–130

References (22)
Baumgartner, Jason. n.d. Reddit Comment Corpus. <[URL]> (27 March 2020).
Berber Sardinha, Tony. 2014. Comparing internet and pre-internet registers. In Multi-dimensional Analysis, 25 Years on: A Tribute to Douglas Biber [Studies in Corpus Linguistics 60], Tony Berber-Sardinha & Marcia Veirano-Pinto (eds), 81–105. Amsterdam: John Benjamins.
Biber, Douglas. 1988. Variation across Speech and Writing. Cambridge: CUP.
Biber, Douglas. 1992. The multi-dimensional approach to linguistic analyses of genre variation: An overview of methodology and findings. Computers and the Humanities 26(5–6): 331–345.
Biber, Douglas. 1993. Representativeness in corpus design. Literary and Linguistic Computing 8(4): 243–257.
Biber, Douglas & Egbert, Jesse. 2016. Register variation on the searchable web: A multi-dimensional analysis. Journal of English Linguistics 44(2): 95–137.
Clarke, Isobelle & Grieve, Jack. 2017. Dimensions of abusive language on Twitter. In Proceedings of the First Workshop on Abusive Language Online, Zeerak Waseem, Wendy Hui Kyong, Dirk Hovy & Joel Tetreault (eds), 1–10. Vancouver BC: Association for Computational Linguistics.
Clarke, Isobelle & Grieve, Jack. 2019. Stylistic variation on the Donald Trump Twitter account: A linguistic analysis of tweets posted between 2009 and 2018. PLoS ONE 14(9).
Conrad, Susan & Biber, Douglas. 2001. Variation in English: Multi-dimensional Studies. Eastbourne: Pearson Education.
Covington, Michael A. & McFall, Joe D. 2010. Cutting the Gordian Knot: The Moving-Average Type-Token Ratio (MATTR). Journal of Quantitative Linguistics 17(2): 94–100.
Eisenstein, Jacob. 2013. What to do about bad language on the internet. In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL), 359–369.
Francis, W. Nelson & Kučera, Henry. 1964. A Standard Corpus of Present-Day Edited American English, for Use with Digital Computers (Brown). Providence, RI: Brown University.
Hess, Carla W., Sefton, Karem M. & Landry, Richard G. 1986. Sample size and type-token ratios for oral language of preschool children. Journal of Speech and Hearing Research 29: 129–134.
Hess, Carla W., Haug, Holly T. & Landry, Richard G. 1989. The reliability of type-token ratios for the oral language of school age children. Journal of Speech and Hearing Research 32: 536–540.
Hiltunen, Turo. 2014. Choice of national variety in the English-language Wikipedia. In Texts and Discourses of New Media, Jukka Tyrkkö & Sirpa Leppänen (eds), n.p. Helsinki: VARIENG. <[URL]> (8 June 2020).
Koizumi, Rie & In’nami, Yo. 2012. Effects of text length on lexical diversity measures: Using short texts with less than 200 tokens. System 40(4): 554–564.
Kubát, Miroslav & Milička, Jiří. 2013. Vocabulary richness measure in genres. Journal of Quantitative Linguistics 20(4): 339–349.
Rosen, Aliza. 2017. Tweeting made easier. Twitter Blog, 7 November 2017, <[URL]> (5 February 2020).
Titak, Ashley & Roberson, Audrey. 2013. Dimensions of web registers: An exploratory multi-dimensional comparison. Corpora 8(2): 239–271.
Vitter, Jeffrey Scott. 1985. Random sampling with a reservoir. ACM Transactions on Mathematical Software 11(1): 37–57.
Cited by (7)

Cited by seven other publications

Laitinen, Mikko & Paula Rautionaho
2025. Reuse of social media data in corpus linguistics. International Journal of Corpus Linguistics 30:2 pp. 171 ff.
Messerli, Thomas C, Daria Dayter, Sven Leuckert, Aatu Liimatta, Hanna Mahler, Axel Bohmann, Gustavo Kozma & Rafaela Tosin
2025. Digital debating cultures: communicative practices on Reddit. Digital Scholarship in the Humanities 40:1 pp. 227 ff.
Heaton, Dan, Elena Nichele, Jeremie Clos, Joel E. Fischer & Michal Ptaszynski
2023. “The algorithm will screw you”: Blame, social actors and the 2020 A Level results algorithm on Twitter. PLOS ONE 18:7 pp. e0288662 ff.
Clarke, Isobelle
2022. Register and social media. Register Studies 4:2 pp. 133 ff.
Liimatta, Aatu
2022. Do registers have different functions for text length? Register Studies 4:2 pp. 263 ff.
Liimatta, Aatu
2023. Register variation across text lengths. International Journal of Corpus Linguistics 28:2 pp. 202 ff.
Liimatta, Aatu
2024. Text length and short texts. In Challenges in corpus linguistics [Studies in Corpus Linguistics, 118], pp. 106 ff.

This list is based on CrossRef data as of 1 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
