Article published in: Learner Corpus Research for Pedagogical Purposes
Edited by Sandra Götz and Sylviane Granger
[International Journal of Learner Corpus Research 10:1] 2024
pp. 216–240
Position paper
Proficiency-rated learner corpora
A promising resource for data-driven learning
Published online: 28 June 2024
https://doi.org/10.1075/ijlcr.00045.for
Abstract
In this position paper, I argue that proficiency-rated learner corpora should gain a more prominent role in
data-driven learning (DDL). With specific reference to DDL, proficiency-rated learner corpora can provide typical, atypical and
erroneous target language data at different levels of proficiency, which can be meaningfully used in the design of learning
activities. This makes them pivotal in expanding the scope of DDL to include mid- and lower-level proficiency learners more
extensively. Although the field of learner corpus research has been promoting learner corpus use in DDL for a long time, only a
small fraction of DDL studies make use of a learner corpus. As a contribution to closing this gap, I will demonstrate how
using a specific proficiency-rated learner corpus (i.e., the CELI corpus; Spina et al., 2022, 2023) can enrich the design of
DDL activities, making them more adaptable to a wider range of learner needs.
Keywords: CEFR, Italian, learner corpus, data-driven learning, proficiency
Article outline
- 1. Introduction
- 2. The overlooked pedagogical potential of proficiency-rated learner corpora
- 3. The CELI corpus
- 4. A learner-corpus-based activity on motion verb constructions
- 4.1 Materials preparation
- 4.2 Activity stages
- Stage 1: Matching activity
- Stage 2: Identify the errors
- Stage 3: Correct the errors
- 5. Discussion and concluding remarks
- Notes
- References
References (42)
Baisa, V., & Suchomel, V. (2014). SkELL:
Web interface for English language learning. Eighth Workshop on Recent Advances in Slavonic
Natural Language Processing, 63–70.
Bestgen, Y., & Granger, S. (2014). Quantifying the development of phraseological competence in L2 English writing:
An automated approach. Journal of Second Language Writing, 26, 28–41.
Bestgen, Y., & Granger, S. (2018). Tracking L2 writers’ phraseological development using collgrams: Evidence from a
longitudinal EFL corpus. In S. Hoffmann, A. Sand, S. Arndt-Lappe, & L. M. Dillmann (Eds.), Corpora and
Lexis (pp. 277–301). Brill.
Boulton, A., & Cobb, T. (2017). Corpus
use in language learning: A meta-analysis. Language
Learning, 67(2), 348–393.
Boulton, A., & Vyatkina, N. (2021). Thirty
years of data-driven learning: Taking stock and charting new directions. Language Learning and
Technology, 25(3), 66–89.
Boyd, A., Hana, J., Nicolas, L., Meurers, D., Wisniewski, K., Abel, A., Schöne, K., Štindlová, B., & Vettori, C. (2014). The
MERLIN corpus: Learner language and the CEFR. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings
of the 9th International Conference on Language Resources and Evaluation (LREC
2014) (pp. 1281–1288). European Language Resources Association (ELRA).
Carlsen, C. (2012). Proficiency
level – A fuzzy variable in computer learner corpora. Applied
Linguistics, 33(2), 161–183.
Casani, E. (2020). Valutare
la competenza morfosintattica in italiano L2. Una validazione corpus-based dei livelli del
QCER. In E. Nuzzo, E. Santoro, & L. Vedder (Eds.), Valutazione
e misurazione delle produzioni orali e scritte in italiano lingua
seconda (pp. 15–26). Cesati.
Chambers, A. (2019). Towards
the corpus revolution? Bridging the research–practice gap. Language
Teaching, 52(4), 460–475.
Cole, M. W. (2014). Speaking
to read: Meta-analysis of peer-mediated learning for English language learners. Journal of
Literacy
Research, 46(3), 358–382.
Forti, L. (2023). Learner
corpora and the design of data-driven learning activities. In B. Bédi, Y. Choubsaz, K. Friðriksdóttir, A. Gimeno-Sanz, S. Björg Vilhjálmsdóttir, & S. Zahova (Eds.), CALL
for all Languages – EUROCALL 2023 Short Papers. University of Iceland, Reykjavik, August
15–18 (pp. 139–144). Editorial Universitat Politècnica de València.
Forti, L., Bolli, G. G., Santarelli, F., Santucci, V., & Spina, S. (2020). MALT-IT2:
A new resource to measure text difficulty in light of CEFR levels for Italian L2
learning. In N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings
of the 12th Language Resources and Evaluation
Conference (pp. 7206–7213). European Language Resources Association (ELRA).
Frey, J. C., König, A., Stemle, E. W., & Paquot, M. (2023, August/September). A
core metadata schema for L2 data. Poster presented at the EuroSLA 32
conference 2023, University of Birmingham, United Kingdom.
Friginal, E. (2018). Corpus
linguistics for English teachers: Tools, online resources, and classroom
activities. Routledge.
Gilquin, G. (2023). Written
learner corpora to inform teaching. In R. R. Jablonkai, & E. Csomay (Eds.), The
Routledge handbook of corpora and English language teaching and
learning (pp. 281–295). Routledge.
Gilquin, G., & Granger, S. (2022). Using
data-driven learning in language teaching. In A. O’Keeffe, & M. J. McCarthy (Eds.), The
Routledge handbook of corpus linguistics (2nd
ed., pp. 430–442). Routledge.
Glaznieks, A., Frey, J.-C., Stopfner, M., Zanasi, L., & Nicolas, L. (2022). LEONIDE:
A longitudinal trilingual corpus of young learners of Italian, German and
English. International Journal of Learner Corpus
Research, 8(1), 97–120.
Götz, S. (2022, August). Learner
corpora and DDL: A promising synergy? Paper presented at the CorpusCALL
SIG symposium “DDL and learner corpora” as part of the EuroCALL conference 2022 (online), University of
Iceland, Iceland.
Goulart, L., & Veloso, I. (Eds.). (2023). Corpora
in English language teaching. Classroom activities for teachers new to corpus linguistics. Open
Educational Resource. Montclair State University.
Granger, S. (1996). From
CA to CIA and back: An integrated approach to computerized bilingual and learner
corpora. In K. Aijmer, B. Altenberg, & M. Johansson (Eds.), Languages
in contrast: Papers from a symposium on text-based cross-linguistic studies: Lund 4–5 March
1994 (pp. 37–51). Lund University Press.
Granger, S. (2015). Contrastive interlanguage analysis: A reappraisal. International Journal of Learner Corpus
Research, 1(1), 7–24.
Granger, S. (2009). The contribution of learner corpora to Second Language Acquisition and foreign language teaching:
A critical evaluation. In K. Aijmer (Ed.), Corpora and Language Teaching (pp. 13–33). John Benjamins.
Granger, S., Dupont, M., Meunier, F., Naets, H., & Paquot, M. (Eds.). (2020). International
Corpus of Learner English. Version 3. Presses universitaires de Louvain.
Granger, S., & Paquot, M. (2017, December). Towards
standardization of metadata for L2 corpora. Invited talk at the CLARIN
workshop on Interoperability of Second Language Resources and Tools, University of
Gothenburg, Sweden.
Gyllstad, H., & Snoder, P. (2021). Exploring
learner corpus data for language testing and assessment purposes: The case of verb + noun
collocations. In S. Granger (Ed.), Perspectives
on the L2
phrasicon (pp. 49–71). Multilingual Matters.
Johns, T. (1991). Should you be persuaded – Two examples of data driven learning materials. In T. Johns, & P. King (Eds.),
Classroom concordancing, English Language Research Journal, 4, 1–16.
La Russa, F., D’Alesio, V., & Suadoni, A. (2023). Designing a corpus-based syllabus of Italian collocations: Criteria,
methods and procedure. Revue Roumaine de Linguistique, 41, 377–389.
Le Foll, E. (2021). Creating
corpus-informed materials for the English as a foreign language classroom. A step-by-step guide for (trainee) teachers using
online resources (Third Edition). Open Educational Resource. [URL]. CC-BY-NC
4.0.
Lee, H., Warschauer, M., & Lee, J. H. (2018). The
effects of corpus use on second language vocabulary learning: A multilevel
meta-analysis. Applied
Linguistics, 40(5), 721–753.
Mizumoto, A., & Chujo, K. (2015). A meta-analysis of data-driven learning approach in the Japanese EFL
classroom. English Corpus Studies, 22, 1–18.
Paquot, M., Rubin, R., & Vandeweerd, N. (2022). Crowdsourced
adaptive comparative judgment: A community-based solution for proficiency rating. Language
Learning, 72(3), 853–885.
Pérez-Paredes, P. (2022). A
systematic review of the uses and spread of corpora and data-driven learning in CALL research during
2011–2015. Computer Assisted Language
Learning, 35(1–2), 36–61.
Pérez-Paredes, P., Ordoñana Guillamón, C., Van de Vyver, J., Meurice, A., Aguado Jiménez, P., Conole, G., & Sánchez Hernández, P. (2019). Mobile
data-driven language learning: Affordances and learners’ perception. System, 84, 145–159.
Poole, R. (2018). A
guide to using corpora for English language learners. Edinburgh University Press.
Seidlhofer, B. (2002). Pedagogy
and local learner corpora: Working with learning-driven
data. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.), Computer
learner corpora, Second Language Acquisition and foreign language
teaching (pp. 213–234). John Benjamins.
Shatz, I. (2020). Refining
and modifying the EFCAMDAT: Lessons from creating a new corpus from an existing large-scale English learner language
database. International Journal of Learner Corpus
Research, 6(2), 220–236.
Spina, S., Fioravanti, I., Forti, L., Santucci, V., Scerra, A., & Zanda, F. (2022). Il
Corpus CELI: Una nuova risorsa per studiare l’acquisizione dell’italiano L2. Italiano
LinguaDue, 14(1), 116–138.
Spina, S., Fioravanti, I., Forti, L., & Zanda, F. (2023). The
CELI corpus: Design and linguistic annotation of a new online learner corpus. Second Language
Research, Ahead of print. [URL]
Vyatkina, N. (2020). Corpora
as open educational resources for language teaching. Foreign Language
Annals, 53(2), 359–370.
