In: Corpus Methods for Semantics: Quantitative studies in polysemy and synonymy
Edited by Dylan Glynn and Justyna A. Robinson
[Human Cognitive Processing 43] 2014
pp. 443–485
Correspondence analysis
Exploring data and identifying patterns
Published online: 6 November 2014
https://doi.org/10.1075/hcp.43.17gly
Correspondence analysis is an exploratory technique for complex categorical data of the kind typical of corpus-driven research. It identifies patterns of association and disassociation in those data. For instance, it can map the correlations between different uses of a linguistic form and its various social and/or morpho-syntactic contexts. The technique presents its results as a two-dimensional plot, which visualises these relationships in an intuitive manner. These plots offer rich representations of the relations between different facets of complex data. Using R, this chapter explains how the technique works and offers a step-by-step explanation of its application and the interpretation of its results. The technique is also compared with cluster analysis, a better-known and comparable method.
Keywords: categorical data, cluster analysis, exploratory statistics, R
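As a rough illustration of what "association and disassociation" means for a table of categorical counts, the sketch below computes the standardized residuals of a small contingency table and its total inertia (the chi-square statistic divided by the table total). This is not the chapter's R code: it is written in plain Python, the row and column labels and all counts are invented for illustration, and the function name is hypothetical. A full correspondence analysis would go on to decompose this residual matrix (by singular value decomposition) to obtain the plot coordinates; that step is omitted here.

```python
# Hypothetical data: rows are uses of a linguistic form, columns are contexts.
# All labels and counts are invented for illustration only.
rows = ["use_A", "use_B", "use_C"]
cols = ["context_1", "context_2"]
counts = [
    [30, 10],
    [10, 30],
    [20, 20],
]


def standardized_residuals(table):
    """Return the matrix of standardized residuals and the table total.

    Each cell is (observed proportion - expected proportion) divided by the
    square root of the expected proportion, where the expected proportion is
    the product of the row mass and the column mass (independence).
    """
    n = sum(sum(row) for row in table)
    n_cols = len(table[0])
    row_mass = [sum(row) / n for row in table]
    col_mass = [sum(row[j] for row in table) / n for j in range(n_cols)]
    resid = [
        [
            (table[i][j] / n - row_mass[i] * col_mass[j])
            / (row_mass[i] * col_mass[j]) ** 0.5
            for j in range(n_cols)
        ]
        for i in range(len(table))
    ]
    return resid, n


resid, n = standardized_residuals(counts)

# Total inertia: sum of squared standardized residuals (chi-square / n).
total_inertia = sum(value * value for row in resid for value in row)

for label, row in zip(rows, resid):
    # Positive values: the use occurs in that context more often than
    # independence predicts (association); negative values: less often.
    print(label, [round(value, 3) for value in row])
print("total inertia:", round(total_inertia, 4))
```

Positive residuals mark combinations that are over-represented relative to independence, negative ones mark combinations that are under-represented, and values near zero indicate no particular attraction either way.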
