Article published in: Register and social media
Edited by Isobelle Clarke and Jack Grieve
[Register Studies 4:2] 2022, pp. 138–170
A text typology of social media
Published online: 27 October 2022
https://doi.org/10.1075/rs.22008.ber
Abstract
This paper introduces an initial text typology of social media posts from a multi-dimensional (MD) perspective. Text types are “[g]roupings of text that are similar in their linguistic form” (Biber 1989: 13). The typology is based on a new MD analysis of social media messages, which is also presented in the paper. The corpus consists of 60,000 social media messages in English compiled from Facebook, Twitter, Instagram, Reddit, Telegram, and YouTube. After the texts were cleaned, the corpus was tagged with the Biber Tagger and post-processed with the Biber Tag Count. Three dimensions of variation were identified, each representing an underlying parameter of variation. Once the texts were scored on each dimension, a k-means cluster analysis was carried out, and the optimal number of clusters was determined using the Cubic Clustering Criterion statistic. A two-way typology was then developed from the dimensional characteristics of each cluster and from careful qualitative analysis of text samples.
Keywords: social media, text typology, multi-dimensional analysis, variation
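The clustering step named in the abstract (k-means over per-text dimension scores) can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration only: the scores are randomly generated rather than taken from the corpus, the helper names (sq_dist, assign, kmeans, wcss, fake_texts) are hypothetical, and the deterministic farthest-first initialisation is a convenience, not the paper's procedure. The Cubic Clustering Criterion used in the paper is not reproduced here; the within-cluster sum of squares is printed as a much simpler stand-in diagnostic.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two score vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, max_iter=100):
    """Lloyd's k-means with farthest-first initialisation (an assumption here)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all seeds chosen so far.
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Move the centroid to the mean of its members; keep it if the cluster is empty.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids

def wcss(points, labels, centroids):
    """Within-cluster sum of squares (a stand-in diagnostic, not the CCC)."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

# Invented scores on three dimensions for two groups of 20 hypothetical texts.
rng = random.Random(42)

def fake_texts(center, n):
    return [tuple(rng.gauss(m, 1.0) for m in center) for _ in range(n)]

scores = fake_texts((5.0, -3.0, -2.0), 20) + fake_texts((-4.0, 4.0, 3.0), 20)

labels, centroids = kmeans(scores, 2)
for k in range(1, 5):
    print(k, round(wcss(scores, *kmeans(scores, k)), 1))
```

In a real analysis the input rows would be the dimension scores computed for each text, and the number of clusters would be chosen with the criterion the study reports rather than by inspecting this diagnostic.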
Article outline
- 1. Introduction
- 2. Literature review
- 3. Methods
- 4. Dimensions of variation
- 4.1 Dimension 1: Formal, prepared, informational communication
- 4.2 Dimension 2: Informal, interactive, stance-marked discourse
- 4.3 Dimension 3: Expression of personal attitudes and feelings
- 4.4 Text length
- 5. Text types
- 5.1 Text type 1: Objective exposition
- 5.2 Text type 2: Subjective expression
- 5.3 Text type distribution
- 6. Discussion and final remarks
- Acknowledgements
- Note
References
Adam, J. M. (2011). A linguística textual – Introdução à análise textual dos discursos [La linguistique textuelle. Introduction à l’analyse textuelle des discours] (M. D. G. Rodrigues, J. G. D. Silva Neto, L. Passeggi, & E. F. Leurquin, Trans.). São Paulo: Cortez.
Berber Sardinha, T. (2014). Comparing Internet and pre-Internet registers. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber (pp. 81–107). Amsterdam/Philadelphia: John Benjamins.
Berber Sardinha, T. (2017). Text types in Brazilian Portuguese: A multidimensional perspective. Corpora, 12(3), 483–515.
Berber Sardinha, T. (2018). Dimensions of variation across Internet registers. International Journal of Corpus Linguistics, 23(2), 125–157.
Berber Sardinha, T. (2022). Corpus linguistics and the study of social media: A case study using multi-dimensional analysis. In A. O’Keeffe & M. McCarthy (Eds.), The Routledge handbook of corpus linguistics (pp. 656–674). New York: Routledge.
Berber Sardinha, T., Kauffmann, C., & Acunzo, C. M. (2014). Dimensions of register variation in Brazilian Portuguese. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber (pp. 35–80). Amsterdam/Philadelphia: John Benjamins.
Berber Sardinha, T., & Shimazumi, M. (2021). A text typology of argumentative essays based on the new ICLE v.3. Paper presented at the 11th International Corpus Linguistics Conference 2021, Limerick, Ireland.
Berber Sardinha, T., & Veirano Pinto, M. (Eds.). (2014). Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber. Amsterdam/Philadelphia: John Benjamins.
Berber Sardinha, T., & Veirano Pinto, M. (Eds.). (2019). Multi-dimensional analysis: Research methods and current issues. London: Bloomsbury Academic.
Berber Sardinha, T., & Veirano Pinto, M. (2021). A linguistic typology of American television. International Journal of Corpus Linguistics, 26(1), 127–160.
Biber, D. (1995). Dimensions of register variation: A cross-linguistic comparison. Cambridge: Cambridge University Press.
Biber, D., & Kurjian, J. (2007). Towards a taxonomy of web registers and text types: A multi-dimensional analysis. In M. Hundt, N. Nesselhauf, & C. Biewer (Eds.), Corpus linguistics and the web (pp. 109–132). Amsterdam / New York: Rodopi.
Bronckart, J. P. (1999). Atividades de linguagem, discursos e textos [Language activities, discourses and texts] (A. R. Machado, Trans.). São Paulo: EDUC.
Charaudeau, P. (2009). Linguagem e discurso: Modos de organização [Langage et Discours – Eléments de sémiolinguistique] (A. M. S. Correa, Trans.). São Paulo, SP: Contexto.
Clarke, I. (2020). Linguistic variation across Twitter and Twitter trolling (PhD dissertation). University of Birmingham, Birmingham.
Clarke, I. (2022). A multi-dimensional analysis of English tweets. Language and Literature. Advance online publication.
Clarke, I., & Grieve, J. (2019). Stylistic variation on the Donald Trump Twitter account: A linguistic analysis of tweets posted between 2009 and 2018. PLOS ONE, 14(9), e0222062.
Egbert, J., & Staples, S. (2019). Doing multi-dimensional analysis in SPSS, SAS, and R. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis: Research methods and current issues (pp. 125–144). London / New York: Bloomsbury Academic.
Fairchild, C. (2007). Building the authentic celebrity: The ‘idol’ phenomenon in the attention economy. Popular Music and Society, 30(3), 355–375.
Friginal, E., & Hardy, J. A. (2014). Conducting multi-dimensional analysis using SPSS. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis, 25 years on: A tribute to Douglas Biber (pp. 298–316). Amsterdam/Philadelphia: John Benjamins.
Friginal, E., & Hardy, J. (2019). From factors to dimensions: Interpreting linguistic co-occurrence patterns. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis: Research methods and current issues (pp. 145–164). London: Bloomsbury Academic.
Friginal, E., Waugh, O., & Titak, A. (2018). Linguistic variation in Facebook and Twitter posts. In E. Friginal & J. A. Hardy (Eds.), Studies in corpus-based sociolinguistics (pp. 342–362). London: Routledge.
Goulart, L., & Wood, M. (2019). Methodological synthesis of research using multi-dimensional analysis. Journal of Research Design and Statistics in Linguistics and Communication Science, 6(2), 107–137.
Gray, B. (2019). Tagging and counting linguistic features for multi-dimensional analysis. In T. Berber Sardinha & M. Veirano Pinto (Eds.), Multi-dimensional analysis: Research methods and current issues (pp. 43–66). London / New York: Bloomsbury Academic.
Holgado-Tello, F. P., Chacon-Moscoso, S., Barbero-Garcia, I., & Vila-Abad, E. (2010). Polychoric versus Pearson correlations in exploratory and confirmatory factor analysis of ordinal variables. Quality & Quantity, 44(1), 153–166.
Liimatta, A. (2019). Exploring register variation on Reddit: A multi-dimensional study of language use on a social media website. Register Studies, 1(2), 269–295.
Marwick, A. (2015). Instafame: Luxury selfies in the attention economy. Public Culture, 27(1), 137–160.
McCulloch, G. (2019). Because Internet: Understanding the new rules of language. New York: Riverhead Books.
O’Halloran, K. (2022). Posthumanism and corpus linguistics. In A. O’Keeffe & M. McCarthy (Eds.), The Routledge handbook of corpus linguistics (pp. 675–692). New York: Routledge.
Prina Dutra, D., & Berber Sardinha, T. (2018). A linguistic typology of sections in research articles: A multi-dimensional perspective. Paper presented at the Arizona Corpus Linguistics Conference (AZCL), Flagstaff, AZ, USA.
Rüdiger, S., & Dayter, D. (2020). The expanding landscape of corpus-based studies of social media language. In S. Rüdiger & D. Dayter (Eds.), Corpus approaches to social media (Vol. 98, Studies in Corpus Linguistics, pp. 1–13). Amsterdam, New York: John Benjamins.
Tannen, D. (1982). Oral and literate strategies in spoken and written narratives. Language, 58(1), 1–21.
Titak, A., & Roberson, A. (2013). Dimensions of web registers: An exploratory multi-dimensional comparison. Corpora, 8(2), 235–260.
