Article published in: International Journal of Learner Corpus Research
Vol. 3:1 (2017), pp. 61–94
Quantitative research methods and study quality in learner corpus research
Published online: 6 June 2017
https://doi.org/10.1075/ijlcr.3.1.03paq
Abstract
This study aims to provide the first empirical assessment of quantitative research methods and study quality in learner corpus
research. We systematically review quantitative primary studies referenced in the Learner Corpus Bibliography
(LCB), a representative bibliography of learner corpus research maintained by the Learner Corpus Association,
which contained 1,276 references when the current study began. Each primary study in the LCB was coded for over fifty features
representing five dimensions: (a) publication type (i.e. conference paper, book chapter, journal article), (b) research focus (e.g.
lexis, grammar), (c) methodological features (e.g. keyword analysis, error analysis, use of reference corpus), (d) statistical
analyses (e.g. χ², t-test, regression analysis), and (e) reporting practices (e.g. reliability coefficients,
means). Results point to several systematic strengths as well as many flaws, such as the absence of research questions, incomplete
and inconsistent reporting practices (e.g. means without standard deviations), and lack of statistical literacy (i.e. LCR studies
generally overrely on tests of statistical significance, do not report effect sizes, rarely check or report whether statistical
assumptions have been met, and rarely use multivariate analyses). Improvements over time, however, are clearly noted and there are
signs that, like other related disciplines, learner corpus research is slowly undergoing methodological reform.
Article outline
- 1. Introduction
- 2. Background
- 2.1 Quantitative designs and statistical techniques in corpus linguistics
- 2.2 Assessing study quality in SLA
- 2.3 The present study
- 3. Method
- 3.1 Study identification
- 3.2 Data collection and coding
- 3.3 Analysis
- 4. Results
- 4.1 Learner demographics, learning contexts, and study designs
- 4.2 Analyses
- 4.3 Reporting practices
- 5. Discussion
- 5.1 The ‘what’ of learner corpus research
- 5.2 The ‘how’ of learner corpus research
- 5.3 Recommendations for future research
- 6. Conclusion
- Acknowledgements
- Note
References
