In: New Frontiers and Connections in Second Language Acquisition: Selected Proceedings of the 17th Generative Approaches to Second Language Acquisition (GASLA-17) Conference
Edited by Tania Ionin and Silvina Montrul
[Language Acquisition and Language Disorders 71] 2026
pp. 360–383
Big educational language learning data (BELL)
Any good for Generative SLA research?
This content is being prepared for publication; it may be subject to changes.
Abstract
I consider the potential research value of Big
Educational Language Learning (BELL) data, that is, data generated
in teaching and assessment environments as part of the everyday
operations of relevant organizations and institutions. Online
language learning and assessment have created unprecedented
opportunities for the collection of large and diverse samples of
learner data across proficiency levels, for a variety of tasks and from
learners with variable linguistic and educational backgrounds. I
argue that, thanks to their size and diverse samples, BELL data can
support multi-factor empirical investigations that are not easily
feasible through lab-based or small-scale field studies.
Importantly, they enable the investigation of several abstract
features in a single design. They thus contribute essential empirical
evidence for feature-based hypotheses and accounts of L2
acquisition and complement experimental investigations.
Keywords: learner corpora, morphosyntax, big data
Article outline
- 1. Introduction
- 2. BELL: Big Educational Language Learning Data
- 3. What can SLA researchers learn from BELL data?
- 3.1 Typological effects on L2/L3 outcomes and their interaction with age, exposure, and educational background
- 3.2 Considering established generalizations: L1 influence and natural order in the acquisition of L2 English morphemes
- 3.3 Extending research designs
- 3.4 Overcoming data sparsity
- 3.5 Reproducibility of results and computational modeling
- 4. Can BELL data support hypothesis driven research?
- 4.1 Usage-based approaches
- 4.2 Evaluating the predictions of alternative hypotheses: Derivational complexity vs. L1-L2 structural overlap in L2 English genitive structures
- 4.3 Distinguishing parameter-based vs. item-based L1-L2 similarity effects: The case of the L2 English articles
- 5. The value of multifactorial designs for GenSLA research: The acquisition of L2 functional morphology
- 5.1 The morphological bottleneck of L2 acquisition: Theoretical views and empirical challenges.
- 5.2 Multifactorial designs: The case of the L2 English articles
- 5.3 Identifying opportunity for learning
- 5.4 Comparing interactions and effects
- 5.5 Extending existing hypotheses and formulating new ones
- 6. Investigating theoretical hypotheses through computational modelling
- 7. An empirical bridge between SLA research and classroom learning and teaching
- 8. Conclusion
- Notes
- References
