In: Corpus Methods for Semantics: Quantitative studies in polysemy and synonymy
Edited by Dylan Glynn and Justyna A. Robinson
[Human Cognitive Processing 43] 2014
pp. 307–341
Techniques and tools
Corpus methods and statistics for semantics
Published online: 6 November 2014
https://doi.org/10.1075/hcp.43.12gly
The use of corpora in semantic research is a rapidly developing method. However, the range of quantitative techniques employed in the field can make it difficult for the non-specialist to keep abreast of methodological developments. This chapter serves as an introduction to the use of corpus methods in Cognitive Semantic research and as an overview of the relevant statistical techniques and the software needed to perform them. The discussion and description are intended for researchers in semantics who are interested in adopting quantitative corpus-driven methods. The chapter argues that there are fundamentally two corpus-driven approaches to meaning: one based on observable formal patterns (collocation analysis) and another based on patterns of annotated usage features (feature analysis). It then introduces and explains each of the statistical techniques currently used in the field. Examples of the use of each technique are listed, and a summary of the R packages available for performing the techniques is included.
References (250)
Afifi, A., May S., & Clark, V.A. (2011).
Practical multivariate analysis
(5th ed.). London: Chapman & Hall.
Arppe, A. (2006). Frequency considerations in morphology: Finnish verbs differ, too.
SKY Journal of Linguistics
, 19, 175–189.
. (2008). Univariate, bivariate and multivariate methods in corpus-based lexicography – A study of synonymy. Unpublished PhD dissertation, University of Helsinki.
Azen, R., & Walker, C. (2011).
Categorical data analysis for the behavioral and social sciences
. New York & Hove: Routledge.
Baayen, R.H. (2008).
Analyzing linguistic data: A practical introduction to statistics using R
. Cambridge: Cambridge University Press.
Baguley, T. (2012).
Loglinear models. Online Supplement 5 to Serious stats: A guide to advanced statistics for the behavioral sciences
. Basingstoke: Palgrave. Available at: [URL].
Balahur, A., & Montoyo, A. (2012). Semantic approaches to fine and coarse-grained feature-based opinion mining. In H. Horacek, E. Métais, R. Muñoz, & M. Wolska (Eds.),
Natural language processing and information systems
(pp. 142–153). Berlin: Springer.
Barnabé, A. (2012). Le schème du chemin en grammaire et sémantique anglaises. Unpublished PhD dissertation, Université Bordeaux 3.
Bates, D. (Forthcoming).
lme4: Mixed-effects modeling with R
. Heidelberg & New York: Springer. Preprints available at: [URL].
Berthele, R. (2010). Investigations into the folk’s mental models of linguistic varieties. In D. Geeraerts, G. Kristiansen, & Y. Peirsman (Eds.),
Advances in cognitive sociolinguistics
(pp. 265–290). Berlin & New York: Mouton de Gruyter.
Biber, D. (2009). A corpus-driven approach to formulaic language in English: Multi-word patterns in speech and writing.
International Journal of Corpus Linguistics
, 14, 275–311.
Biber, D., & Jones, J. (2009). Quantitative methods in Corpus Linguistics. In A. Lüdeling, & M. Kytö (Eds.),
Corpus Linguistics: An international handbook
. Vol. 2. (pp. 1287–1304). Berlin & New York: Mouton de Gruyter.
Borg, I., Groenen, & Mair, P. (2013).
Applied multidimensional scaling
. Heidleberg & New York: Springer.
Borg, I., & Groenen, P. (2005).
Modern multidimensional scaling
(2nd ed.). Heidelberg & New York: Springer.
Bresnan, J., Cueni, A., Nikitina, T., & Baayen, H. (2007). Predicting the dative. In G. Bouma, I. Krämer, & J. Zwarts (Eds.),
Cognitive foundations of interpretation alternation
(pp. 69–94). Amsterdam: Royal Netherlands Academy of Arts and Sciences.
Bybee, J., & Eddington, D. (2006). A usage-based approach to Spanish verbs of ‘becoming’.
Language
, 82, 323–355.
Cadoret, M., Lê, S., & Pagès, J. (2011). Multidimensional scaling versus multiple correspondence analysis when analyzing categorization data. In B. Fichet, D. Piccolo, R. Verde, & M. Vichi (Eds.),
Classification and multivariate analysis for complex data structures
(pp. 301–308). Heidleberg & New York: Springer.
Chaffin, R. (1992). The concept of a semantic relation. In A. Lehrer, & E. Kittay (Eds.),
Frames, fields, and contrasts: New essays in semantic and lexical organisation
(pp. 253–288).
London: Lawrence Erlbaum.
Chessel, D., & Dufour, A.-B. (2013). Analysis of ecological data: Exploratory and Euclidean methods in environmental sciences. Available at: [URL].
Chessel, D., Dufour A.-B, & Thioulouse, Y. (2004) The ade4 package – I: One-table methods.
R News
, 4, 5–10.
Christensen, R. (1997).
Log-linear models and logistic regression
(2nd ed.). Heidleberg & New York: Springer.
. (2012). A tutorial on fitting cumulative link models with the ordinal package. Available at: [URL].
Clancy, S. (2006). The topology of Slavic case: Semantic maps and multidimensional scaling.
Glossos
, 7, 1–28.
Colleman, T. (2009). The semantic range of the Dutch double object construction. A collostructional perspective.
Constructions and Frames
, 1, 190–221.
. (2010). Beyond the dative alternation: The semantics of the Dutch aan-Dative. In D. Glynn, & K. Fischer (Eds.),
Quantitative Cognitive Semantics: Corpus-driven approaches
(pp. 271–304). Berlin & New York: Mouton de Gruyter.
Croft, W., & Poole, K. (2008). Inferring universals from grammatical variation: Multidimensional scaling for typological analysis.
Theoretical Linguistics
, 34, 1–37.
Croissant, Y. (2013). Estimation of multinomial logit models in R: The mlogit packages. Available at: [URL].
Daille, B., Dubreil, E. Monceaux, L., & Vernier, M. (2011). Annotating opinion–evaluation of blogs: The Blogoscopy corpus.
Language Resources and Evaluation
, 45, 409–437.
De Cock, B. (2014a).
A discourse-functional analysis of speech participant profiling in spoken Spanish
. Amsterdam & Philadelphia: John Benjamins.
. (2014b). The discursive effects of Spanish impersonals uno and se
. In D. Glynn, & M. Sjölin (Eds.),
Subjectivity and epistemicity: Corpus, discourse, and literary approaches to stance (pp. 103–120)
. Lund: Lund University Press.
De Leeuw, J., & Mair, P. (2009a). Simple and canonical correspondence analysis using the R package anacor.
Journal of Statistical Software
, 31, 1–18.
. (2009b). Multidimensional scaling using majorization: The R package smacof.
Journal of Statistical Software
, 31, 1–30.
. (2013a). anacor: Simple and canonical correspondence analysis. Available at: [URL].
De Leeuw, J., & Mair, M. (2013b). SMACOF for multidimensional scaling. Available at: [URL].
Deignan, A. (2005).
Metaphor and Corpus Linguistics
. Amsterdam & Philadelphia: John
Benjamins.
Delorge, M. (2009). A diachronic corpus study of the constructional behaviours of reception verbs in Dutch. In B. Lewandowska-Tomaszczyk, & K. Dziwirek (Eds.),
Studies in Cognitive Corpus Linguistics
(pp. 249–272). Frankfurt/Main: Peter Lang.
Desagulier, G. (In press). Le statut de la fréquence dans les Grammaires de Constructions: ‘simple comme bonjour’?
Langages
.
. (Submitted). Quite new methods for a rather old issue: Exploring and visualizing collocation data from the BNC with correspondence analysis.
Deshors, S. (2011). A multifactorial study of the uses of may and can in French-English interlanguage. Unpublished PhD dissertation, University of Sussex.
. (2014). Identifying different types of non-native co-occurrence patterns: A corpus-based approach. In D. Glynn, & M. Sjölin (Eds.),
Subjectivity and epistemicity: Corpus, discourse, and literary approaches to stance (pp. 387–412)
. Lund: Lund University Press.
Diehl, H. (2014). On modal meaning in the uses of quite, rather, pretty and fairly as degree modifiers in British English. Unpublished PhD dissertation, Lund University.
Dirven, R., Goossens, L., Putseys, Y., & Vorlat, E. (1982).
The scene of linguistic action and its perspectivization by speak, talk, say, and tell
. Amsterdam & Philadelphia: John Benjamins.
Divjak, D. (2006). Ways of intending: A corpus-based Cognitive Linguistic approach to near-synonyms in Russian. In St. Th. Gries, & A. Stefanowitsch (Eds.),
Corpora in Cognitive Linguistics: Corpus-based approaches to syntax and lexis
(pp. 19–56). Berlin & New York: Mouton de Gruyter.
. (2010a).
Structuring the lexicon: A clustered model for near-synonymy
. Berlin & New York: Mouton de Gruyter.
. (2010b). Corpus-based evidence for an idiosyncratic aspect-modality relation in Russian. In D. Glynn, & K. Fischer (Eds.),
Quantitative Cognitive Semantics: Corpus-driven approaches
(pp. 305–331). Berlin & New York: Mouton de Gruyter.
Divjak, D., & Gries, St. Th. (2006). Ways of trying in Russian: Clustering behavioral profiles.
Corpus Linguistics and Linguistic Theory
, 2, 23–60.
. (2009). Corpus-based Cognitive Semantics: A contrastive study of phrasal verbs in English and Russian. In B. Lewandowska-Tomaszczyk, & K. Dziwirek (Eds.),
Studies in Cognitive Corpus Linguistics
(pp. 273–296). Frankfurt/Main: Peter Lang.
. (Eds.). (2012).
Frequency effects in language learning and processing
. Berlin & New York: Mouton de Gruyter.
Drenan, R. (2009).
Statistics for archaeologists: A common sense approach
(2nd ed.). Heidelberg & New York: Springer.
Dziwirek, K., & Lewandowska-Tomaszczyk, B. (2011).
Complex emotions and grammatical mismatches: A contrastive corpus-based study
. Berlin & New York: Mouton de Gruyter.
Everitt, B.S., & Hothorn, I. (2010).
A handbook of statistical analyses using R
(2nd ed.). Boca Raton: Taylor & Francis.
Everitt, B.S., Landau, S., Leese, M., & Stahl, D. (2011).
Cluster analysis
(5th ed.). Chichester: John Wiley.
Evert, S. (2009). Corpora and collocations. In A. Lüdeling, & M. Kytö (Eds.),
Corpus Linguistics: An international handbook
(pp. 1212–1249). Berlin & New York: Mouton de Gruyter.
Faraway, J. (2002). Practical regression and anova using R. Available at: [URL].
. (2006).
Extending the linear model with R: Generalized linear, mixed effects and nonparametric regression models
. London: Taylor & Francis.
Field, A., Miles, J., & Field, Z. (2012).
Discovering statistics using R
. London & Thousand Oaks: Sage.
Fillmore, C., & Atkins, B. (1992). Toward a frame-based lexicon: The semantics of risk and its neighbours. In A. Lehrer, & E. Kittay (Eds.),
Frames, fields, and contrasts: New essays in semantic and lexical organisation
(pp. 75–102). London: Lawrence Erlbaum.
Firth, J.R. (1957). A synopsis of linguistic theory 1930–1955. In J.R. Firth (Ed.),
Studies in linguistic analysis
(pp. 1–32). Oxford: Basil Blackwell.
Fischer, K. (2000).
From Cognitive Semantics to Lexical Pragmatics: The functional polysemy of discourse particles
. Berlin & New York: Mouton de Gruyter.
Flores Salgado, E. (2011).
The pragmatics of requests and apologies: Developmental patterns in Mexican students
. Amsterdam & Philadelphia: John Benjamins.
Fontaine, J., Scherer, K., & Soriano, C. (Eds.). (2013).
Components of emotional meaning: A sourcebook
. Oxford: Oxford University Press.
Funke, S., Mair, P., & von Eye, A. (2007). cfa: R package for the analysis of configuration frequencies. Available at: [URL].
Geeraerts, D. (2010). The doctor and the semantician. In D. Glynn, & K. Fischer (Eds.),
Quantitative Cognitive Semantics: Corpus-driven approaches
(pp. 63–78). Berlin & New York: Mouton de Gruyter.
. (2011). Entrenchment, conventionalization, and empirical method. Presented at the 44th Meeting of the Societas Linguistica Europaea, Logroño.
Geeraerts, D., Grondelaers, S., & Bakema, P. (1994).
The structure of lexical variation: Meaning, naming, and context
. Berlin & New York: Mouton de Gruyter.
Geeraerts, D., Grondelaers, S., & Speelman, D. (1999).
Convergentie en Divergentie in de Nederlandse Woordenschat
. Amsterdam: Meertens Instituut.
Geeraerts, D., Kristiansen, G., & Peirsman, Y. (Eds.). (2010).
Advances in cognitive sociolinguistics
. Berlin & New York: Mouton de Gruyter.
Gelman, A., & Hill, J. (2007).
Data analysis using regression and multilevel/hierarchical models
. Cambridge: Cambridge University Press.
Gilquin, G. (2010).
Corpus, cognition and causative constructions
. Amsterdam & Philadelphia: John Benjamins.
Glynn, D. (2007). Mapping meaning: Toward a usage-based methodology in Cognitive Semantics. Unpublished PhD dissertation, University of Leuven.
. (2009). Polysemy, syntax, and variation: A usage-based method for Cognitive Semantics. In V. Evans, & S. Pourcel (Eds.),
New directions in Cognitive Linguistics
(pp. 77–106). Amsterdam & Philadelphia: John Benjamins.
. (2010a). Synonymy, lexical fields, and grammatical constructions: A study in usage-based Cognitive Semantics. In H.-J. Schmid, & S. Handl (Eds.),
Cognitive foundations of linguistic usage-patterns: Empirical studies
(pp. 89–118). Berlin & New York: Mouton de Gruyter.
. (2010b). Testing the hypothesis: Objectivity and verification in usage-based Cognitive Semantics. In D. Glynn, & K. Fischer (Eds.),
Quantitative Cognitive Semantics: Corpus-driven approaches
(pp. 239–270). Berlin & New York: Mouton de Gruyter.
. (2014a). The conceptual profile of the lexeme home: A multifactorial diachronic analysis. In J. E. Díaz-Vera (Ed.), Metaphor and metonymy across time and cultures (pp. 265–293). Berlin & New York: Mouton de Gruyter.
. (2014b). The social nature of anger: Multivariate corpus evidence for context effects upon conceptual structure. In I. Novakova, P. Blumenthal, & D. Siepmann (Eds.),
Emotions in discourse
(pp. 69–82). Frankfurt/Main: Peter Lang.
. (Forthcoming).
Mapping meaning: Corpus methods for Cognitive Semantics
.
Cambridge: Cambridge University Press.
Glynn, D., & Sjölin, M. (2011). Cognitive Linguistic methods for literature: A usage-based approach to metanarrative and metalepsis. In A. Kwiatkowska (Ed.),
Texts and minds: Papers in cognitive poetics and rhetoric
(pp. 85–102). Frankfurt/Main: Peter Lang.
Glynn, D., & Krawczak, K. (Forthcoming). Social cognition, Cognitive Grammar and corpora: A multifactorial approach to epistemic modality.
Cognitive Linguistics
.
Glynn, D., & Fischer, D. (Eds.). (2010).
Quantitative Cognitive Semantics: Corpus-driven approaches
. Berlin & New York: Mouton de Gruyter.
Glynn, D., & Sjölin, M. (Eds.). (2014).
Subjectivity and epistemicity: Corpus, discourse, and literary approaches to stance
. Lund: Lund University Press.
Gries, St. Th. (1999). Particle movement: A cognitive and functional approach.
Cognitive Linguistics
, 10, 105–145.
. (2000). Towards multifactorial analyses of syntactic variation: The case of particle placement. Doctoral dissertation, University of Hamburg.
. (2003).
Multifactorial analysis in Corpus Linguistics: A study of particle placement
. London: Continuum Press.
. (2006). Corpus-based methods and Cognitive Semantics: The many senses of to run
. In St. Th. Gries, & A. Stefanowitsch (Eds.),
Corpora in Cognitive Linguistics: Corpus-based approaches to syntax and lexis
(pp. 57–99). Berlin & New York: Mouton de Gruyter.
. (2009b).
Statistics for Linguistics with R: A practical introduction
(1st ed.). Berlin & New York: Mouton de Gruyter.
. (2010). Behavioral profiles: A fine-grained and quantitative approach in corpus based lexical semantics.
The Mental Lexicon
, 5, 323–346.
. (2013).
Statistics for linguistics with R: A practical introduction
(2nd ed.). Berlin & New York: Mouton de Gruyter.
Gries, St. Th., & Divjak, D. (2009). Behavioral profiles: A corpus-based approach to cognitive semantic analysis. In V. Evans, & S. Pourcel (Eds.),
New directions in Cognitive Linguistics
(pp. 57–75). Amsterdam & Philadelphia: John Benjamins.
Gries, St. Th., & Hilpert, M. (2008). The identification of stages in diachronic data: Variability-based neighbor clustering.
Corpora
, 3, 59–81.
Gries, St. Th., & Stefanowitsch, A. (2004a). Extending collostructional analysis: A corpus-based perspective on ‘alternations’.
International Journal of Corpus Linguistics
, 9, 97–129.
. (2004b). Co-varying collexemes in the into-causative. In M. Achard, & S. Kemmer (Eds.),
Language, culture, and mind
(pp. 225–36). Stanford: CSLI.
Gries, St. Th., & Divjak, D. (Eds.). (2012).
Frequency effects in language representation
. Berlin & New York: Mouton de Gruyter.
Gries, St. Th., & Stefanowitsch, A. (Eds.). (2006).
Corpora in Cognitive Linguistics: Corpus-based approaches to syntax and lexis
. Berlin & New York: Mouton de Gruyter.
Grondelaers, S. (2000). De distributie van niet-anaforisch er buiten de eerste zinsplaats: Sociolexicologische, functionele en psycholinguïstische aspecten van er’s status als presentatief signaal. Doctoral dissertation, University of Leuven.
Grondelaers S., Geeraerts, D., & Speelman, D. (2007). A case for a cognitive Corpus Linguistics. In M. Gonzalez-Marquez, I. Mittleberg, S. Coulson, & M. Spivey (Eds.),
Methods in Cognitive Linguistics
(pp. 149–169). Amsterdam & Philadelphia: John Benjamins.
Grondelaers S., Speelman, D., & Geeraerts, D. (2008). National variation in the use of er “there”: Regional and diachronic constraints on cognitive explanations. In G. Kristiansen, & R. Dirven (Eds.),
Cognitive Sociolinguistics: Language variation, cultural models, social systems
(pp. 153–204). Berlin & New York: Mouton de Gruyter.
Hadfield, J. (2010). MCMC methods for multi-response generalized linear mixed models: The MCMCglmm R package.
Journal of Statistical Software
, 33, 1–22.
Härdle, W., & Simar, L. (2007).
Applied multivariate statistical analysis
. Heidelberg & New York: Springer.
Harrell, F. (2001).
Regression modeling strategies: With Applications to linear models, logistic regression, and survival analysis
. Heidelberg & New York: Springer.
. (2012). Regression modeling strategies. Unpublished manuscript, available at: [URL].
Hennig, C. (2013). Flexible procedures for clustering. Available at: [URL].
Heylen, K. (2005a). A quantitative corpus study of German word order variation. In St. Kepser, & M. Reis (Eds.),
Linguistic evidence: Empirical, theoretical and computational perspectives
(pp.241–264). Berlin & New York: Mouton de Gruyter.
. (2005b). Zur Abfolge (pro)nominaler Satzglieder im Deutschen: Eine korpusbasierte Analyse der relativen Abfolge von nominalem Subjekt und pronominalem Objekt im Mittelfeld, 264. Doctoral dissertation, University of Leuven.
Heylen, K., & Ruette, T. (2013). Degrees of semantic control in measuring aggregated lexical distances. In L. Borin, A. Saxena, A., & T. Rama (Eds.),
Approaches to measuring linguistic differences
(pp. 353–374). Berlin & New York: Mouton de Gruyter.
Heylen, K., Tummers, J., & Geeraerts, D. (2008). Methodological issues in corpus-based Cognitive Linguistics. In G. Kristiansen, & R. Dirven (Eds.),
Cognitive Sociolinguistics: Language variation, cultural models, social systems
(pp. 91–128). Berlin & New York: Mouton de Gruyter.
Hilpert, M. (2008).
Germanic future constructions: A usage-based approach to language change
. Amsterdam & Philadelphia: John Benjamins.
. (2009). The German mit-predicative construction.
Constructions and Frames
, 1, 29–55.
. (2012).
Constructional change in English: Developments in allomorphy, word formation, and syntax
. Cambridge: Cambridge University.
Hoffmann, Th. (2011).
Preposition placement in English: A usage-based approach
. Cambridge: Cambridge University Press.
Hox, J. (2010).
Multilevel analysis: Techniques and applications
(2nd ed.). Hove & New York: Routledge.
Husson, F. Josse, J., Lê, S., & Mazet, J. (2013). Multivariate exploratory data analysis and data mining with R. Available at: [URL].
Husson, F., Lê, S., & Pagès, J. (2011).
Exploratory multivariate analysis by example using R
.
London: Chapman & Hall.
Izenman, A. (2008).
Modern
multivariate statistical techniques: Regression, classification and manifold learning
. Heidelberg & New York: Springer.
Janda, L., & Solovyev, V. (2009). What constructional profiles reveal about synonymy: A case study of the Russian words for sadness and happiness.
Cognitive Linguistics
, 20, 367–393.
Kärkkäinen, E. (2003).
Epistemic stance in English conversation: A description of its interactional functions, with a focus on I think
. Amsterdam & Philadelphia: John Benjamins.
Kaufman, L., & Rousseeuw, P. (2005) [1990].
Finding groups in data: An introduction to cluster analysis
. Hoboken: John Wiley.
Klavan, J. (2012). Evidence in linguistics: Corpus-linguistic and experimental methods for studying grammatical synonymy. Doctoral Dissertation, University of Tartu.
Klavan, J, Kesküla, K., & Ojava, L. (2011). Synonymy in grammar: The Estonian adessive case and the adposition peal ‘on’. In S. Kittilä, K. Västi, & J. Ylikoski (Eds.),
Studies on case, animacy and semantic roles
(pp. 1–19). Amsterdam & Philadelphia: John Benjamins.
Krawczak, K. (2014a). Shame and its near-synonyms in English: A multivariate corpus-driven approach to social emotions. In I. Novakova, P. Blumenthal, & D. Siepmann (Eds.),
Emotions in discourse
(pp. 84–94). Frankfurt/Main: Peter Lang.
. (2014b). Epistemic stance predicates in English: A quantitative corpus-driven study of subjectivity. In D. Glynn, & M. Sjölin (Eds.),
Subjectivity and epistemicity: Corpus, discourse, and literary approaches to stance (pp. 355–386)
. Lund: Lund University Press.
. (In press). Corpus evidence for the cross-cultural structure of social emotions: Shame, embarrassment, and guilt in English and Polish.
Poznań Studies in Contemporary Linguistics
.
Krawczak, K., & Glynn, D. (2011). Context and cognition: A corpus-driven approach to parenthetical uses of mental predicates. In K. Kosecki, & J. Badio (Eds.),
Cognitive processes in language
(pp. 87–99). Frankfurt/Main: Peter Lang.
Krawczak, K., & Kokorniak, I. (2012). Corpus-driven quantitative approach to the construal of Polish ‘think’.
Poznań Studies in Contemporary Linguistics
, 48, 439–472.
Krawczak, K., & Glynn, D. (In press). Operationalising construal: Of/about prepositional profiling for cognitive and communicative predicates. In C.M. Bretones Callejas (Ed.),
Construals in language and thought: What shapes what?
Amsterdam
:
John Benjamins.
Lê, S., Josse, J., & Husson, F. (2008). FactoMineR: An R package for multivariate analysis.
Journal of Statistical Software
, 25, 1–18.
Le Roux, B., & Rouanet, H. (2004).
Geometric data analysis: From correspondence analysis to structured data analysis
. Dordrecht: Kluwer.
Lesnoff, M., & Lancelot, R. (2013). Analysis of overdispersed data. Available at: [URL].
Levshina, N. (2011). A usage-based study of Dutch causative constructions. Doctoral dissertation, University of Leuven.
. (2012). Comparing constructicons: A usage-based analysis of the causative construction with doen in Netherlandic and Belgian Dutch.
Constructions and Frames
, 4, 76–101.
Levshina, N., Geeraerts, D., & Speelman, D. (2013a). Towards a 3D-grammar: Interaction of linguistic and extralinguistic factors in the use of Dutch causative constructions.
Journal of Pragmatics
, 52, 34–48.
. (2013b). Mapping constructional spaces: A contrastive analysis of English and Dutch analytic causatives.
Linguistics
, 51, 825–854.
Lewandowska-Tomaszczyk, B., & Dziwirek, K. (Eds.). (2009).
Studies in Cognitive Corpus Linguistics
. Frankfurt/Main: Peter Lang.
Long, J.S., & Freese, J. (2006) [2001].
Regression models for categorical dependent variables using Stata
. College Station: Stata Press.
Louwerse, M., & Van Peer, W. (2009). How cognitive is cognitive poetics? The interaction between symbolic and embodied cognition. In G. Brône, & J. Vandaele (Eds.),
Cognitive poetics goals, gains and gaps
(pp. 423–444). Berlin & New York: Mouton de Gruyter.
Maechler, M. (2013). Cluster analysis extended. Available at: [URL].
Maindonald, J. (2008). Using R for data analysis and graphics: Introduction, code and commentary. Available at: [URL].
Maindonald, J., & Braun, J. (2010) [2003].
Data analysis and graphics using R
(3rd ed.).
Cambridge: Cambridge University Press.
Marden, J. (2011).
Multivariate statistical analysis:
Old school. Department of Statistics, University of Illinois at Urbana-Champaign. Available at: [URL].
Martin, A.D., Quinn, K.M., & Park, J.H. (2010). Markov chain Monte Carlo (MCMC) package. Available at: [URL].
. (2010).
Logistic regression: From introductory to advanced concepts and applications
. London & Los Angeles: Sage.
Morgenstern, A., Blondel, M., Caët, S., & Boutet, D. (2011). Hearing children’s use of pointing gestures: From pre-linguistic buds to the blossoming of communication skills. Presentation at SALC III, Copenhagen.
Murtagh, F. (2005).
Correspondence analysis and data coding with R and Java
. London: Chapman & Hall.
Myers, D. (1994). Testing for prototypicality: The Chinese morpheme gong
.
Cognitive Linguistics
, 5, 261–280.
Neandić, O., & Greenacre, M. (2007). Correspondence analysis in R, with two- and three-dimensional graphics: The ca Package.
Journal of Statistical Software
, 20, 1–13.
Newman, J., & Rice, S. (2004). Patterns of usage for English sit, stand, and lie: A cognitively-inspired exploration in corpus linguistics.
Cognitive Linguistics
, 15, 351–396.
. (2006). Transitivity schemas of English eat and drink in the BNC. In St. Th. Gries, & A. Stefanowitsch (Eds.),
Corpora in Cognitive Linguistics: Corpus-based approaches to syntax and lexis
. (pp. 225–260). Berlin & New York: Mouton de Gruyter.
Nordmark, H., & Glynn, D. (2013). anxiety between mind and society: A corpus-driven cross-cultural study of conceptual metaphors.
Explorations in English Language and Linguistics
, 1, 107–130.
O’Connell, A. (2006).
Logis
tic regression models for ordinal response variables
. London & Thousand Oaks: Sage.
Orme, J., & Combs-Orme, T. (2009).
Multiple regression with discrete dependent variables
.
Oxford: Oxford University Press.
Peirsman, Y. Heylen, K., & Geeraerts, D. (2010). Applying word space models to sociolinguistics: Religion names before and after 9/11. In D. Geeraerts, G. Kristiansen, & Y. Peirsman (Eds.),
Advances in Cognitive Sociolinguistics
(pp. 111–139). Berlin & New York: Mouton de Gruyter.
Pęzik, P. (2009). Extraction of multiword expressions for corpus-based discourse analysis. In B. Lewandowska-Tomaszczyk, & K. Dziwirek (Eds.),
Studies in Cognitive Corpus Linguistics
(pp. 249–272). Frankfurt/Main: Peter Lang.
Pichler, H. (2013).
The structure of discourse-pragmatic variation
. Amsterdam & Philadelphia: John Benjamins.
Plevoets, K., Speelman, D., & Geeraerts, D. (2008). The distribution of T/V pronouns in Netherlandic and Belgian Dutch. In K. Schneider, & A. Baron (Eds.),
Variational pragmatics: Regional varieties in pluricentric languages
(pp. 181–209). Amsterdam & Philadelphia: John Benjamins.
Pütz, M, Robinson, J.A., & Reif, M. (Eds.) (2012).
Cognitive Sociolinguistics: Social and cultural variation in cognition and language use
. (Special edition of Annual Review of Cognitive Linguistics, 10.)
Ravid, D., & Hanauer, D. (1998). A prototype theory of rhyme: Evidence from Hebrew.
Cognitive Linguistics
, 9, 79–106.
Read, J., & Carroll, J. (2012). Annotating expressions of Appraisal in English.
Language Resources and Evaluation
, 46, 421–447.
Reif, M., Robinson, J.A., & Pütz, M. (Eds.). (2013).
Variation in language and language use: Linguistic, socio-cultural and cognitive perspectives
. Frankfurt/Main: Peter Lang.
Rice, S., Sandra, D., & Vanrespaille, M. (1999). Prepositional semantics and the fragile link between space and yime. In M. Hiraga, C. Sinha, & S. Wilcox (Eds.),
Cultural, typology and psycholinguistic issues in Cognitive Linguistics
(pp. 107–127). Amsterdam & Philadelphia: John Benjamins.
Ripley, B. (2013). Support functions and datasets for Venables and Ripley’s MASS. Available at: [URL].
Robinson, J.A. (2010a).
Awesome insights into semantic variation. In D. Geeraerts,
G. Kristiansen, & Y. Piersman (Eds.),
Advances in Cognitive Sociolinguistics
(pp. 85–109). Berlin & New York: Mouton de Gruyter.
. (2010b). Semantic variation and change in present-day English. Doctoral dissertation, University of Sheffield.
. (2012). A sociolinguistic perspective on semantic change. In K. Allan, & J.A. Robinson (Eds.),
Current methods in Historical Semantics
(pp. 191–231). Berlin & New York: Mouton de Gruyter.
Roever, C., Raabe, N., Luebke, K., Ligges, U., Szepannek, G., & Zentgraf, M. (2013). Classification and visualization. Unpublished manuscript available at: [URL].
Rudzka-Ostyn, B. (1989). Prototypes, schemas, and cross-category correspondences: The case of ask
. In D. Geeraerts (Ed.),
Prospects and problems of prototype theory
(pp. 613–661). Berlin & New York: Mouton de Gruyter.
. (1995). Metaphor, schema, invariance: The case of verbs of answering. In L. Goossens, P. Pauwels, B. Rudzka-Ostyn, A.-M. Simon-Vandenbergen, & J. Vanparys (Eds.),
By word of mouth: Metaphor, metonymy, and linguistic action from a cognitive perspective
(pp. 205–244). Amsterdam & Philadelphia: John Benjamins.
Ruette, T., Ehret, K., & Szmrecsanyi, B. (In press).
Frequency effects in lexical sociolectometry are insubstantial
. In H. Behrens, & S. Pfänder (Eds.),
Again on frequency effects in language
. Berlin & New York: Mouton de Gruyter.
Ruette, T., Geeraerts, D., Peirsman, Y., & Speelman, D. (Forthcoming). Semantic weighting mechanisms in scalable lexical sociolectometry. In B. Szmrecsanyi, & B. Waelchli (Eds.),
Aggregating dialectology and typology: Linguistic variation in text and speech, within and across languages
. Berlin & New York: Mouton de Gruyter.
Sagi, E., Kaufmann, S., & Clark, B. (2011). Tracing semantic change with latent semantic analysis. In K. Allan, & J. Robinson (Eds.),
Current methods in Historical Semantics
(pp. 161–183). Berlin & New York: Mouton de Gruyter.
Sandra, D., & Rice, S. (1995). Network analyses of prepositional meaning: Mirroring whose mind – the linguist’s or the language user’s?
Cognitive Linguistics
, 6, 89–130.
Scheibman, J. (2002).
Point of view and grammar: Structural patterns of subjectivity in American English conversation
. Amsterdam & Philadelphia: John Benjamins.
Scherer, K. (2005). What are emotions? And how can they be measured?
Social Science Information
, 44, 693–727.
Schmid, H.-J. (1993).
Cottage and co., idea, start vs. begin: Die kategorisierung als grundprinzip einer differenzierten
bedeutungsbeschreibung
. Tübingen: Max Niemeyer.
. (2000).
English abstract nouns as conceptual shells: From corpus to cognition
.
Berlin & New York: Mouton de Gruyter.
Schmidtke-Bode, K. (2009).
Going-to-V and gonna-V in child language: A quantitative approach to constructional development.
Cognitive Linguistics
, 20, 509–553.
Schönbrodt, F., Collins, L., & Stemmler, M. (2013). cfa2: Configuration frequency analysis with a design matrix. Available at: [URL].
Schulze, R. (1991). Getting round to (a)round: Towards the description and analysis of a ‘spatial’ predicate. In G. Rauh (Ed.),
Approaches to prepositions
(pp. 253–74).Tubingen: Günter Narr.
Smith, R. (2011).
Multilevel modeling of social problems: A causal perspective
. Heidelberg: Springer.
Speelman, D., & Geeraerts, D. (2010). Causes for causatives: The case of Dutch ‘doen’ and ‘laten’. In T. Sanders, & E. Sweetser (Eds.),
Causal categories in discourse and cognition
(pp. 173–204). Berlin & New York: Mouton de Gruyter.
Speelman, D., Tummers, J., & Geeraerts, D. (2009). Lexical patterning in a Construction Grammar: The effect of lexical co-occurrence patterns on the inflectional variation in Dutch attributive adjectives.
Constructions and Frames
, 1, 87–118.
Stefanowitsch, A. (2010). Empirical Cognitive Semantics: Some thoughts. In D. Glynn, & K. Fischer (Eds.),
Quantitative Cognitive Semantics: Corpus-driven approaches
(pp. 355–380). Berlin & New York: Mouton de Gruyter.
Stefanowitsch, A., & St. Th. Gries. (2003). Collostructions: Investigating the interaction of words and constructions.
International Journal of Corpus Linguistics
, 8, 209–243.
. (2008). Register and constructional meaning: A collostructional case study. In G. Kristiansen, & R. Dirven (Eds.),
Cognitive Sociolinguistics: Language variation, cultural models, social systems
(pp. 129–152). Berlin & New York: Mouton de Gruyter.
Stefanowitsch, A., & Gries, St. Th. (Eds.). (2006).
Corpus-based approaches to metaphor and metonymy
. Berlin & New York: Mouton de Gruyter.
Stevens, J. (2001).
Applied multivariate statistics for the social sciences
(4th ed.). Mahwah: Lawrence Erlbaum.
Strobl, C., Hothorn, T., & Zeileis, A. (2009a). Party on! A new, conditional variable importance measure for random forests available in the party package.
The R Journal
, 1, 14–17.
Strobl, C., Malley, J., & Tutz, G. (2009b). An introduction to recursive partitioning: Rationale, application, and characteristics of classification and regression trees, bagging, and random forests.
Psychological Methods
, 14, 323–348.
Suzuki, R. (2013). Hierarchical clustering with p-values via multiscale bootstrap resampling. Available at: [URL].
Suzuki, R., & Shimodaira, H. (2006). Pvclust: An R package for assessing the uncertainty in hierarchical clustering.
Bioinformatics
, 22, 1540–1542.
Szelid, V., & Geeraerts, D. (2008). Usage-based dialectology: Emotion concepts in the Southern Csango dialect.
Annual Review of Cognitive Linguistics
, 6, 23–49.
Szmrecsanyi, B. (2003).
Be going to versus will/shall: Does syntax matter?
Journal of English Linguistics
, 31, 295–323.
Szmrecsanyi, B. (2006).
Morphosyntactic persistence in spoken English: A corpus study at the intersection of Variationist Sociolinguistics, Psycholinguistics, and Discourse Analysis
. Berlin & New York: Mouton de Gruyter.
Szmrecsanyi, B. (2010). The English genitive alternation in a cognitive sociolinguistic perspective. In D. Geeraerts, G. Kristiansen, & Y. Peirsman (Eds.),
Advances in Cognitive Sociolinguistics
(pp. 141–166). Berlin & New York: Mouton de Gruyter.
Szmrecsanyi, B. (2013).
Grammatical variation in British English dialects
. Cambridge: Cambridge University Press.
Taboada, M., & Carretero, M. (2012). Contrastive analyses of evaluation in text: Key issues in the design of an annotation system for attitude applicable to consumer reviews in English and Spanish.
Linguistics and the Human Sciences
, 6, 275–295.
Tarling, R. (2009).
Statistical modelling for social researchers: Principles and practice
. London & New York: Routledge.
Therneau, T., Atkinson, E., & Mayo Foundation. (2013). An introduction to recursive partitioning using the RPART routines. Available at: [URL].
Thompson, L. (2009).
S-PLUS (and R) manual to accompany Agresti’s categorical data analysis (2002)
. Available at: [URL].
Tummers, J., Heylen, K., & Geeraerts, D. (2005). Usage-based approaches in Cognitive Linguistics: A technical state of the art.
Corpus Linguistics and Linguistic Theory
, 1, 225–261.
Valenzuela Manzanares, J., & Rojo López, A.M. (2008). What can language learners tell us about constructions? In S. De Knop, & T. De Rycker (Eds.),
Cognitive approaches to pedagogical grammar? A volume in honour of René Dirven
(pp. 197–230). Berlin & New York: Mouton de Gruyter.
Van Bogaert, J. (2010). A constructional taxonomy of I think and related expressions: Accounting for the variability of complement-taking mental predicates.
English Language and Linguistics
, 14, 399–428.
Venables, W., & Ripley, B. (2002).
Modern applied statistics with S
(4th ed.). Heidelberg: Springer.
Verdonik, D., Rojc, M., & Stabej, M. (2007). Annotating discourse markers in spontaneous speech corpora on an example for the Slovenian language.
Language Resources and Evaluation
, 41, 147–180.
von Eye, A. (2002).
Configural frequency analysis: Methods, models, and applications
. Mahwah: Erlbaum.
von Eye, A., & Mair, P. (2008). A functional approach to configural frequency analysis.
Austrian Journal of Statistics
, 37, 161–173.
von Eye, A., Mair, P., & Mun, E.-Y. (2010).
Advances in configural frequency analysis
. London: Guilford Press.
von Eye, A., & Mun, E.-Y. (2013).
Log-linear modeling: Concepts, interpretation, and application
. Hoboken: John Wiley.
Wiebe, J., Wilson, T., & Cardie, C. (2005). Annotating expressions of opinions and emotions in language.
Language Resources and Evaluation
, 39, 165–210.
Wiechmann, D. (2008). On the computation of collostruction strength: Testing measures of association as expressions of lexical bias.
Corpus Linguistics and Linguistic Theory
, 4, 253–290.
Wong, M. (2009).
Gei constructions in Mandarin Chinese and bei constructions in Cantonese: A corpus-driven contrastive study.
International Journal of Corpus Linguistics
, 14, 60–80.
Wulff, S. (2003). A multifactorial corpus analysis of adjective order in English.
International Journal of Corpus Linguistics
, 8, 245–82.
Wulff, S. (2006).
Go-V vs. go-and-V in English: A case of constructional synonymy? In St. Th. Gries, & A. Stefanowitsch (Eds.),
Corpora in Cognitive Linguistics: Corpus-based approaches to syntax and lexis
(pp. 101–126). Berlin & New York: Mouton de Gruyter.
Wulff, S. (2010). Marrying cognitive-linguistic theory and corpus-based methods: On the compositionality of English V NP-idioms. In D. Glynn, & K. Fischer (Eds.),
Quantitative Cognitive Semantics: Corpus-driven approaches
(pp. 223–238). Berlin & New York: Mouton de Gruyter.
Wulff, S., Stefanowitsch, A., & Gries, St. Th. (2007). Brutal Brits and persuasive Americans: Variety-specific meaning construction in the into-causative. In G. Radden, K.-M. Köpcke, Th. Berg, & P. Siemund (Eds.),
Aspects of meaning construction
(pp. 265–281). Amsterdam & Philadelphia: John Benjamins.
Zeschel, A. (2010). Exemplars and analogy: Semantic extension in constructional networks. In D. Glynn, & K. Fischer (Eds.),
Quantitative Cognitive Semantics: Corpus-driven approaches
(pp. 201–221). Berlin & New York: Mouton de Gruyter.
Zhao, Y. (2013). R and data mining: Examples and case studies. Unpublished manuscript. Available at: [URL].
Cited by (16)
Bębeniec, Daria
Guardamagna, Caterina
2024. A corpus-based analysis of ‘vernacular synonyms’. International Journal of Corpus Linguistics 29:4 ► pp. 562 ff.
Wang, Haitao, Toshiyuki Kanamaru & Ke Li
2024. The polysemy of the Japanese temperature adjective atsui. Review of Cognitive Linguistics 22:2 ► pp. 476 ff.
González Granado, Nicolás, Patrick Drouin & Aurélie Picton
Kambara, Kazuho & Tsukasa Yamanaka
Schneider, Edgar W.
Kokorniak, Iwona
2022. Contrast and analogy in aspectual distinctions of English and Polish. In Analogy and Contrast in Language [Human Cognitive Processing, 73], ► pp. 115 ff.
Hartmann, Stefan
Podhorodecka, Joanna
Zehentner, Eva
Calvo, Elisa & Marián Morón
Kokorniak, Iwona & Alicja Jajko-Siwek
Deshors, Sandra C.
Murrieta-Flores, Patricia & Naomi Howell
Riou, Marine
This list is based on CrossRef data as of 10 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
