In: Applying Corpora in Teaching and Learning Romance Languages
Edited by Henry Tyne and Stefania Spina
[Studies in Corpus Linguistics 122] 2025
pp. 180–204
Chapter 8. Data-driven learning effects on the development of Italian L2 phraseological competence
The combined role of semantic transparency and frequency
Published online: 20 November 2025
https://doi.org/10.1075/scl.122.08for
Abstract
This chapter presents a study focused on the analysis of second language phraseological competence
development in light of a data-driven learning (DDL) intervention and in the context of Italian language learning
courses. The analysis focuses on the role that specific properties (i.e., semantic transparency and frequency) of the
phraseological units considered (i.e., collocations) play in assessing the effects of DDL. The study involved a total
of 123 first language Chinese learners of Italian and was based on a between-groups design. The research questions of
the study move from a general picture of DDL effects (How does DDL influence the learning of collocations overall?) to
a more nuanced one (What effect does DDL have on learning semantically opaque and semantically transparent
collocations, also in relation to frequency?). The study demonstrates the value of considering the linguistic
properties of the learning aims when evaluating the overall effects of a learning approach such as DDL. Implications
related to DDL research methodology are discussed.
Keywords: data-driven learning, Italian, phraseology, frequency, semantic transparency
Article outline
- 1. Introduction
- 2. The role of semantic transparency and frequency in learning collocations in an L2: Literature review
- 2.1 Semantic transparency
- 2.2 Frequency
- 3. The potential of concordance-based data-driven learning: Rationale and aims of the study
- 4. Method
- 4.1 Study design
- 4.2 Participants
- 4.3 Identification and coding of collocations
- 4.4 Development of the phraseological competence test
- 4.5 Data analysis
- 5. Results
- 6. Discussion
- 7. Concluding remarks
