In: Handbook of Pragmatics: Manual
Edited by Jef Verschueren and Jan-Ola Östman
[Handbook of Pragmatics M2] 2022
pp. 1738–1762
Regression analysis
Published online: 3 October 2022
https://doi.org/10.1075/hop.m2.reg2
Article outline
- 1. Introduction
- 2. Building blocks
- 3. Model 0: Modeling the numeric response variable as a function of one numeric predicting variable
- 4. Model 1: Modeling the numeric response variable as a function of one categorical predicting variable
- 5. Model 2: Modeling the numeric response variable as a function of two categorical predicting variables
- 6. Model 3: Modeling the numeric response variable as a function of two categorical predicting variables that interact
- 7. Logistic regression
- 8. Model 4: Modeling the binary response variable as a function of one categorical predicting variable
- 9. Model 5: Modeling the binary response variable as a function of two categorical predicting variables that interact
- 10. Model 6: Modeling the binary response variable as a function of two categorical predicting variables that interact with a third categorical variable
- 11. Independence assumption and mixed effects models
- 12. Model 8: Modeling the binary response variable as a function of two categorical predicting variables that interact with a third categorical variable and two nested random predicting variables
- 13. Statistical significance, model planning, and effect size
- 14. Conclusion
