In: Complexity, Accuracy and Fluency in Learner Corpus Research
Edited by Agnieszka Leńko-Szymańska and Sandra Götz
[Studies in Corpus Linguistics 104] 2022
pp. 115–136
Phraseological complexity in EFL learners’ spoken production across proficiency levels
Published online: 1 December 2022
https://doi.org/10.1075/scl.104.05paq
Abstract
This study explores phraseological complexity in the spoken production of English as a Foreign Language (EFL) learners across proficiency levels in the Trinity Lancaster Corpus. Phraseological diversity and sophistication are operationalized as root type-token ratios and median mutual information scores of verb + object co-occurrences, respectively. The results paint a complex picture of phraseological complexity in EFL learners’ oral performance: phraseological diversity increases on the whole, whereas phraseological sophistication decreases significantly from B1 to B2. These findings can at least partly be explained by the fact that, unlike learners at B1, EFL learners at B2 and above repeat fewer combinations and use more specific vocabulary, which sometimes leads to less idiomatic combinations that should nevertheless be regarded as traces of qualitative development.
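The two measures named in the abstract can be illustrated with a minimal sketch. It assumes the textbook definitions, root type-token ratio as types divided by the square root of tokens and mutual information as log2 of observed over expected co-occurrence frequency; the chapter's exact extraction pipeline, reference corpus and frequency thresholds are not given here. All function names and the toy frequencies are hypothetical, and taking the median over distinct pair types (rather than tokens) is an assumption.

```python
import math
from statistics import median


def root_ttr(pairs):
    """Root type-token ratio: distinct pairs / sqrt(total pairs)."""
    if not pairs:
        return 0.0
    return len(set(pairs)) / math.sqrt(len(pairs))


def mi_score(pair_freq, verb_freq, noun_freq, total):
    """Mutual information: log2(observed / expected co-occurrence frequency)."""
    expected = verb_freq * noun_freq / total
    return math.log2(pair_freq / expected)


def median_mi(pairs, ref_pair_freq, ref_verb_freq, ref_noun_freq, ref_total):
    """Median MI over the distinct pairs that are attested in the reference data.

    Assumption: pairs missing from the reference counts are skipped.
    """
    scores = [
        mi_score(ref_pair_freq[p], ref_verb_freq[p[0]], ref_noun_freq[p[1]], ref_total)
        for p in set(pairs)
        if p in ref_pair_freq
    ]
    return median(scores) if scores else None


# Hypothetical verb + object pairs from one learner text.
learner_pairs = [
    ("make", "decision"),
    ("make", "decision"),
    ("take", "photo"),
    ("have", "time"),
]

# Hypothetical reference-corpus frequencies (invented for illustration only).
ref_pair_freq = {("make", "decision"): 8, ("take", "photo"): 2, ("have", "time"): 2}
ref_verb_freq = {"make": 16, "take": 16, "have": 64}
ref_noun_freq = {"decision": 32, "photo": 16, "time": 64}
ref_total = 1024

diversity = root_ttr(learner_pairs)
sophistication = median_mi(
    learner_pairs, ref_pair_freq, ref_verb_freq, ref_noun_freq, ref_total
)
print(diversity, sophistication)
```

Under these definitions, a learner who repeats the same few strongly associated combinations can score high on median MI while scoring low on root type-token ratio, which is consistent with the B1 pattern the abstract describes.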
Article outline
- 1. Introduction
- 2. Method
- 2.1 Learner data
- 2.2 Data preparation: Co-occurrence extraction and analysis
- 2.3 Statistical evaluation
- 3. Results
- 4. Discussion
- 5. Conclusion
