Article published in: Questionable Research Practices in Applied Linguistics
Edited by Luke Plonsky
[Journal of Second Language Studies 8:2] 2025
pp. 281–312
Conceptualization and frequency of sampling (mal)practices in L2 inferential quantitative research
Towards a potential QRP?
Available under the Creative Commons Attribution-NonCommercial (CC BY-NC) 4.0 license.
For any use beyond this license, please contact the publisher at rights@benjamins.nl.
This article was made Open Access under a CC BY-NC 4.0 license through payment of an APC by Waseda University and funding secured by the corresponding author.
Published online: 18 August 2025
https://doi.org/10.1075/jsls.00049.vit
Abstract
Poor sampling practices can constitute a questionable research practice (QRP) in L2 inferential quantitative research. The current study, a methodological synthesis of sampling practices (N = 433 Scopus/Web of Science (WoS) reports, selected via cluster random sampling), revealed that L2 inferential quantitative researchers rarely employed randomized and/or effect size-driven sampling processes: only eight (1.8%) and ten (2.3%) of the reports, respectively, were satisfactory in these respects. Furthermore, just 33.9% of the reports featured multisite (convenience) samples. In models assessing what predicted multisite sampling, ISLA focus (rs = −.33, p < .001) and single authorship (rs = −.15, p < .001) showed moderate and weak negative associations, respectively. Citation analysis metric values and the Scopus/WoS contrast showed no associations. The findings suggest that the field's sampling practices have room to improve, and guidance for future improvement is offered.
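The proportions reported in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the counts of eight and ten and the pool size of 433 come from the abstract, while the count of 147 multisite reports is an assumption back-calculated from the reported 33.9%. The Wilson score interval is likewise an assumption added here to show the uncertainty around such small proportions; the abstract does not state which interval method, if any, the authors used. The function names are hypothetical.

```python
import math

N_REPORTS = 433  # report pool size stated in the abstract


def percent(count, total=N_REPORTS):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


def wilson_interval(count, total=N_REPORTS, z=1.96):
    """Wilson score interval for a single proportion.

    Assumption: chosen for illustration; not confirmed as the
    authors' method. z=1.96 corresponds to roughly 95% coverage.
    """
    p = count / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Counts from the abstract; 147 is back-calculated (assumption).
randomized = 8
effect_size_driven = 10
multisite_assumed = round(0.339 * N_REPORTS)

for label, count in [
    ("randomized sampling", randomized),
    ("effect size-driven planning", effect_size_driven),
    ("multisite samples (assumed count)", multisite_assumed),
]:
    low, high = wilson_interval(count)
    print(f"{label}: {count}/{N_REPORTS} = {percent(count)}% "
          f"(Wilson interval {100 * low:.1f}% to {100 * high:.1f}%)")
```

With counts this small relative to the pool, the interval is noticeably asymmetric around the point estimate, which is one reason a score-based interval is often preferred over a simple normal approximation for rare outcomes.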
Article outline
- 1. Introduction
- 2. Literature review
- 2.1 Poor sampling practices as a potential QRP
- 2.2 Defining quality in SLA quantitative research: Representativeness and sample size planning
- 2.2.1 Representativeness
- 2.2.2 Effect size-driven sample size planning and justification
- 2.3 Potential predictors of L2 sampling quality
- 2.4 Presentation of research questions (RQs)
- 3. Methodology
- 3.1 Report pool creation
- 3.2 Measurement (Variable) construction
- 3.2.1 Bibliometric and citation analysis metrics
- 3.2.2 ISLA focus and single authorship
- 3.2.3 Aspects of sample quality
- 3.3 Data analysis
- 4. Results
- 4.1 Stage 1 results
- 4.2 Stage 2 results
- 4.2.1 Journal cluster check
- 4.2.2 Correlational data addressing Stage 2’s RQs
- 5. Post-hoc analysis: Exploration of multisite convenience sampling practices in the report pool
- 5.1 Phase 1: A four-category conceptualization of SLA multisite sampling practices in the report pool
- 5.2 Phase 2: Considering the number of sites of SLA multisite samples in the report pool
- 5.3 Phase 3: The statistical processing of sites within multisite SLA research in the report pool
- 5.4 Brief discussion of post-hoc analysis
- 6. Discussion
- 6.1 Stage 1: The frequency of sampling QRPs in L2 inferential quantitative research
- 6.2 Stage 2: Predicting the incidence of multisite samples within L2 quantitative research
- 7. Conclusion: Limitations and summary
- Notes
