In: Critical Reflections on Data in Second Language Acquisition
Edited by Aarnes Gudmestad and Amanda Edmonds
[Language Learning & Language Teaching 51] 2018
pp. 63–88
Chapter 3. Data analysis and sampling
Methodological issues concerning proficiency in SLA research
Published online: 10 September 2018
https://doi.org/10.1075/lllt.51.04lea
This chapter addresses the construct of second-language (L2) proficiency as it pertains to L2 data analysis. A common practice in L2 studies is to group participants into proficiency categories such as ‘intermediate’ or ‘advanced’, a practice known as dichotomization. I argue that, for both theoretical and empirical reasons, proficiency should instead be analyzed as a truly continuous variable. Theoretically, adult L2 acquisition is conceptualized as a continuous process driven by basic learning mechanisms that may be constrained by underlying principles. Empirically, I present evidence illustrating why creating categorical groups by carving up a continuous proficiency measure is statistically inappropriate. Finally, I address the importance of sampling practices and explain why it is preferable to include participants from a broad proficiency spectrum.
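The statistical cost of splitting a continuous proficiency score into groups can be sketched with a small simulation. The example below is not the chapter's own analysis: the data are simulated, and the names `proficiency`, `outcome`, and `simulate`, the sample size, and the assumed correlation of 0.5 are all hypothetical choices made for illustration. It compares the correlation between an outcome and the continuous score with the correlation obtained after a median split of that same score.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=5000, rho=0.5, seed=1):
    """Simulate a continuous proficiency score and a correlated outcome.

    All values are hypothetical: standard-normal proficiency scores and an
    outcome constructed to correlate with them at roughly `rho`.
    """
    rng = random.Random(seed)
    proficiency = [rng.gauss(0, 1) for _ in range(n)]
    noise_sd = (1 - rho ** 2) ** 0.5
    outcome = [rho * p + rng.gauss(0, noise_sd) for p in proficiency]

    # Median split: 1.0 for the "higher" group, 0.0 for the "lower" group.
    cut = statistics.median(proficiency)
    group = [1.0 if p > cut else 0.0 for p in proficiency]

    return pearson(proficiency, outcome), pearson(group, outcome)


r_continuous, r_split = simulate()
print(f"continuous predictor: r = {r_continuous:.3f}")
print(f"median-split predictor: r = {r_split:.3f}")
```

Under these assumptions, the median-split predictor should show a weaker association with the outcome than the continuous score does, because every participant within a group is treated as identical. For a normally distributed predictor, a median split is expected to shrink the correlation to roughly 80% of its original size.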
Article outline
- Introduction
- Background
- The construct of L2 proficiency
- The measurement of L2 proficiency in empirical studies
- Issues in data analysis: Dichotomizing proficiency
- Data sampling and proficiency
- Conclusion
- Acknowledgements
- Notes
- References
