Article published in: International Journal of Learner Corpus Research
Vol. 7:2 (2021), pp. 197–229
Applying phraseological complexity measures to L2 French
A partial replication study
Published online: 11 October 2021
https://doi.org/10.1075/ijlcr.20015.van
Abstract
This study partially replicates Paquot’s (2018, 2019) study of phraseological complexity in L2 English by investigating how phraseological complexity
compares across proficiency levels as well as how phraseological complexity measures relate to lexical, syntactic and
morphological complexity measures in a corpus of L2 French argumentative essays. Phraseological complexity is operationalized as
the diversity (root type-token ratio; RTTR) and sophistication (pointwise mutual information; PMI) of three
types of grammatical dependencies: adjectival modifiers, adverbial modifiers and direct objects. Results reveal a significant
increase in the mean PMI of direct objects and the RTTR of adjectival modifiers across proficiency levels. In
addition to phraseological sophistication, important predictors of proficiency include measures of lexical diversity, lexical
sophistication, syntactic (phrasal) complexity and morphological complexity. The results provide cross-linguistic validation for
the results of Paquot (2018, 2019) and
further highlight the importance of including phraseological measures in the current repertoire of L2 complexity measures.
Keywords: L2 French, replication, phraseology, collocations, complexity, CEFR
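The two operationalizations named in the abstract can be illustrated with a small sketch. It assumes the standard definitions (RTTR as the number of types divided by the square root of the number of tokens; PMI as the base-2 logarithm of observed over expected co-occurrence probability) and estimates PMI from a toy reference list of dependency pairs. The function names, the toy data, and the decision to skip learner pairs that are unattested in the reference list are illustrative assumptions, not the procedure used in the study.

```python
import math
from collections import Counter


def rttr(pairs):
    """Root type-token ratio: distinct pairs / sqrt(total pairs)."""
    if not pairs:
        return 0.0
    return len(set(pairs)) / math.sqrt(len(pairs))


def pmi_table(reference_pairs):
    """Map each reference pair (x, y) to log2(p(x, y) / (p(x) * p(y))).

    Probabilities are estimated from the reference pairs alone:
    p(x, y) = c(x, y) / N, p(x) = c(x in slot 1) / N, p(y) = c(y in slot 2) / N.
    """
    n = len(reference_pairs)
    pair_counts = Counter(reference_pairs)
    first_counts = Counter(x for x, _ in reference_pairs)
    second_counts = Counter(y for _, y in reference_pairs)
    return {
        (x, y): math.log2(c * n / (first_counts[x] * second_counts[y]))
        for (x, y), c in pair_counts.items()
    }


def mean_pmi(learner_pairs, table):
    """Mean PMI over learner pair tokens found in the reference table.

    Assumption: unattested pairs are skipped; returns None if none are attested.
    """
    scores = [table[p] for p in learner_pairs if p in table]
    return sum(scores) / len(scores) if scores else None


# Hypothetical toy data: (verb, direct object) lemma pairs.
reference = [("prendre", "décision")] * 2 + [("faire", "chose")] * 2
learner = [
    ("prendre", "décision"),
    ("prendre", "décision"),
    ("faire", "chose"),
    ("manger", "idée"),  # unattested in the reference list, so skipped for PMI
]

table = pmi_table(reference)
print(rttr(learner))             # 3 types / sqrt(4 tokens) = 1.5
print(mean_pmi(learner, table))  # 1.0
```

In this toy setup, a learner text scores higher on diversity when it uses more distinct pairs relative to its length, and higher on sophistication when its pairs are more strongly associated in the reference data.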
Article outline
- 1. Introduction
- 2. Complexity research in L2 French
- 3. Data and method
- 3.1 Learner data
- 3.2 Complexity measures
- 3.2.1 Phraseological complexity
- 3.2.2 Lexical complexity
- 3.2.3 Syntactic complexity
- 3.2.4 Morphological complexity
- 3.3 Analysis
- 4. Results
- 4.1 Phraseological measures (RQ1)
- 4.2 Random forest model (RQ2)
- 5. Discussion
- 5.1 How does phraseological complexity compare in written L2 French at different proficiency levels (RQ1)?
- 5.2 To what extent does phraseological complexity relate to lexical, syntactic and morphological complexity in L2 written French (RQ2)?
- 6. Conclusion
- Acknowledgements
- Notes
- Supplementary materials
- References
References (64)
Ågren, M., Granfeldt, J., & Schlyter, S. (2012). The growth of complexity and accuracy in L2 French: Past observations and recent applications of developmental stages. In A. Housen, F. Kuiken, & I. Vedder (Eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA (pp. 95–120). Amsterdam: John Benjamins.
Bartning, I., & Schlyter, S. (2004). Itinéraires acquisitionnels et stades de développement en français L2. Journal of French Language Studies, 14(3), 281–299.
Batista, R., & Horst, M. (2016). A new receptive vocabulary size test for French. The Canadian Modern Language Review, 72(2), 211–233.
Bestgen, Y., & Granger, S. (2018). Tracking L2 writers’ phraseological development using collgrams: Evidence from a longitudinal EFL corpus. In S. Hoffmann, A. Sand, S. Arndt-Lappe, & L. M. Dillmann (Eds.), Corpora and lexis (pp. 277–301). Leiden: Brill Rodopi.
Blanche-Benveniste, C., & Adam, J.-P. (1999). La conjugaison des verbes: Virtuelle, attestée, défective. Recherches sur le français parlé, 15, 87–112.
Bulté, B. (2013). The development of complexity in second language acquisition: A dynamic systems approach (Unpublished doctoral dissertation). Vrije Universiteit Brussel, Brussels, Belgium.
Bulté, B., & Housen, A. (2012). Defining and operationalising L2 complexity. In A. Housen, F. Kuiken, & I. Vedder (Eds.), Dimensions of L2 performance and proficiency: Complexity, accuracy and fluency in SLA (pp. 21–46). Amsterdam: John Benjamins.
Candito, M., Nivre, J., Denis, P., & Anguiano, E. H. (2010). Benchmarking of statistical dependency parsers for French. Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010: Poster Volume), 108–116.
Church, K. W., & Hanks, P. (1989). Word association norms, mutual information, and lexicography. Proceedings of the 27th Annual Meeting on Association for Computational Linguistics, 76–83.
Cobb, T., & Horst, M. (2004). Is there room for an academic word list in French? In P. Bogaards & B. Laufer (Eds.), Vocabulary in a second language: Selection, acquisition, and testing (pp. 15–38). Amsterdam: John Benjamins.
Council of Europe. (2001). The common European framework of reference for languages: Learning, teaching, assessment. Cambridge: Cambridge University Press.
De Clercq, B. (2015). The development of lexical complexity in second language acquisition: A cross-linguistic study of L2 French and English. EUROSLA Yearbook, 15(1), 69–94.
De Clercq, B. (2016). The development of linguistic complexity: A comparative study on L2 French and L2 English (Unpublished doctoral dissertation). Vrije Universiteit Brussel, Brussels, Belgium.
De Clercq, B., & Housen, A. (2017). A cross-linguistic perspective on syntactic complexity in L2 development: Syntactic elaboration and diversity. The Modern Language Journal, 101(2), 315–334.
De Clercq, B., & Housen, A. (2019). The development of morphological complexity: A cross-linguistic study of L2 French and English. Second Language Research, 35(1), 71–97.
Demol, A., & Hadermann, P. (2008). An exploratory study of discourse organisation in French L1, Dutch L1, French L2 and Dutch L2 written narratives. In G. Gilquin, S. Papp, & M. B. Díez-Bedmar (Eds.), Linking up contrastive and learner corpus research (pp. 255–282). Amsterdam: Brill.
Denis, P., & Sagot, B. (2012). Coupling an annotated corpus and a lexicon for state-of-the-art POS tagging. Language Resources and Evaluation, 46, 721–736.
Durrant, P., & Schmitt, N. (2009). To what extent do native and non-native writers make use of collocations? International Review of Applied Linguistics in Language Teaching, 47(2), 157–177.
Erman, B., Denke, A., Fant, L., & Forsberg Lundell, F. (2015). Nativelike expression in the speech of long-residency L2 users: A study of multiword structures in L2 English, French and Spanish. International Journal of Applied Linguistics, 25(2), 160–182.
Forsberg, F., & Bartning, I. (2010). Can linguistic features discriminate between the communicative CEFR-levels?: A pilot study of written L2 French. In I. Bartning, M. Martin, & I. Vedder (Eds.), Communicative proficiency and linguistic development: Intersections between SLA and language testing (pp. 133–157). European Second Language Association.
Forsberg Lundell, F., Lindqvist, C., & Edmonds, A. (2018). Productive collocation knowledge at advanced CEFR levels: Evidence from the development of a test for advanced L2 French. The Canadian Modern Language Review, 74(4), 627–649.
Garner, J., Crossley, S. A., & Kyle, K. (2018). N-gram measures and L2 writing proficiency. System, 80, 176–187.
Granger, S., & Bestgen, Y. (2014). The use of collocations by intermediate vs. advanced non-native writers: A bigram-based study. International Review of Applied Linguistics in Language Teaching, 52(3), 229–252.
Greenwell, B. (2017). pdp: An R package for constructing partial dependence plots. The R Journal, 9(1), 421–436.
Greenwell, B., Boehmke, B., & Gray, B. (2019). vip: Variable importance plots. Retrieved from [URL]
Guiraud, P. (1954). Les caractères statistiques du vocabulaire. Paris: Presses Universitaires de France.
Gyllstad, H., Granfeldt, J., Bernardini, P., & Källkvist, M. (2014). Linguistic correlates to communicative proficiency levels of the CEFR: The case of syntactic complexity in written L2 English, L3 French and L4 Italian. EUROSLA Yearbook, 14(1), 1–30.
Hothorn, T., Buehlmann, P., Dudoit, S., Molinaro, A., & Van Der Laan, M. (2006). Survival ensembles. Biostatistics, 7(3), 355–373.
Hunston, S., & Francis, G. (2000). Pattern grammar. Amsterdam: John Benjamins.
Levshina, N. (2015). How to do linguistics with R: Data exploration and statistical analysis. Amsterdam: John Benjamins.
Lindqvist, C., Gudmundson, A., & Bardel, C. (2013). A new approach to measuring lexical sophistication in L2 oral production. In C. Bardel, C. Lindqvist, & B. Laufer (Eds.), L2 vocabulary acquisition, knowledge and use: New perspectives on assessment and corpus analysis (Eurosla Monographs Series 2) (pp. 109–126). European Second Language Association.
Lonsdale, D., & Le Bras, Y. (2009). A frequency dictionary of French: Core vocabulary for learners. New York: Routledge.
Norris, J. M., & Ortega, L. (2009). Towards an organic approach to investigating CAF in instructed SLA: The case of complexity. Applied Linguistics, 30(4), 555–578.
Ortega, L. (2012). Interlanguage complexity: A construct in search of theoretical renewal. In B. Szmrecsanyi & B. Kortmann (Eds.), Linguistic complexity: Second language acquisition, indigenization, contact (pp. 127–155). Berlin: De Gruyter.
Ovtcharov, V., Cobb, T., & Halter, R. (2006). La richesse lexicale des productions orales: Mesure fiable du niveau de compétence langagière. The Canadian Modern Language Review, 63(1), 107–125.
Pallotti, G. (2015). A simple view of linguistic complexity. Second Language Research, 31(1), 117–134.
Paquot, M. (2018). Phraseological competence: A missing component in university entrance language tests? Insights from a study of EFL learners’ use of statistical collocations. Language Assessment Quarterly, 15(1), 29–43.
Paquot, M. (2019). The phraseological dimension in interlanguage complexity research. Second Language Research, 35(1), 121–145.
Paquot, M., & Granger, S. (2012). Formulaic language in learner corpora. Annual Review of Applied Linguistics, 32, 130–149.
Peters, E., Velghe, T., & Van Rompaey, T. (2019). The VocabLab tests: The development of an English and French vocabulary test. International Journal of Applied Linguistics, 170(1), 53–78.
Plonsky, L., & Oswald, F. L. (2014). How big is “big”? Interpreting effect sizes in L2 research. Language Learning, 64(4), 878–912.
Porte, G. (2012). Replication research in applied linguistics. Cambridge: Cambridge University Press.
R Core Team. (2019). R: A language and environment for statistical computing. Vienna, Austria. Retrieved on 27 July 2021 from [URL]
Rosenthal, R. (1994). Parametric measures of effect size. In H. Cooper & L. V. Hedges (Eds.), The Handbook of Research Synthesis (pp. 231–244). New York, NY: Russell Sage Foundation.
Rubin, R., Housen, A., & Paquot, M. (2021). Phraseological complexity as an index of L2 Dutch writing proficiency: A partial replication study. In S. Granger (Ed.), Perspectives on the L2 phrasicon: The view from learner corpora (pp. 101–125). Bristol: Multilingual Matters.
Schäfer, R. (2015). Processing and querying large web corpora with the COW14 architecture. In P. Bański, H. Biber, E. Breiteneder, M. Kupietz, H. Lüngen, & A. Witt (Eds.), Proceedings of the 3rd Workshop on Challenges in the Management of Large Corpora (CMLC-3), 28–34.
Schäfer, R., & Bildhauer, F. (2012). Building large corpora from the web using a new efficient tool chain. In N. Calzolari, K. Choukri, T. Declerck, M. U. Doğan, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC’12), 486–493.
Skehan, P. (2009). Modelling second language performance: Integrating complexity, accuracy, fluency, and lexis. Applied Linguistics, 30(4), 510–532.
Staples, S., Egbert, J., Biber, D., & Gray, B. (2016). Academic writing development at the university level: Phrasal and clausal complexity across level of study, discipline, and genre. Written Communication, 33(2), 149–183.
Stengers, H., Boers, F., Housen, A., & Eyckmans, J. (2011). Formulaic sequences and L2 oral proficiency: Does the type of target language influence the association? International Review of Applied Linguistics in Language Teaching, 49(4), 321–343.
Strobl, C., Boulesteix, A.-L., Kneib, T., Augustin, T., & Zeileis, A. (2008). Conditional variable importance for random forests. BMC Bioinformatics, 9(307).
Strobl, C., Boulesteix, A.-L., Zeileis, A., & Hothorn, T. (2007). Bias in random forest variable importance measures: Illustrations, sources and a solution. BMC Bioinformatics, 8(25).
Treffers-Daller, J. (2013). Measuring lexical diversity among L2 learners of French: An exploration of the validity of D, MTLD and HD-D as measures of language ability. In S. Jarvis & M. Daller (Eds.), Vocabulary knowledge: Human ratings and automated measures (pp. 79–104). Amsterdam: John Benjamins.
Tutin, A., & Grossman, F. (2014). L’écrit scientifique: Du lexique au discours. Autour de Scientext [Scientific writing: From lexis to discourse. Overview of Scientext]. Rennes: Presses Universitaires de Rennes.
Vanderbauwhede, G. (2012). Le déterminant démonstratif en français et en néerlandais à travers les corpus: Théorie, description, acquisition (Unpublished doctoral dissertation). Katholieke Universiteit Leuven, Leuven, Belgium; Université Paris Ouest Nanterre La Défense, Paris, France.
Vandeweerd, N. (2021). fsca: French syntactic complexity analyzer. International Journal of Learner Corpus Research, 7(2), 259–274.
Vanhove, J. (2018). Computer code for cleaning, tagging, and analysing the texts. Retrieved on 27 July 2021 from [URL]
Verspoor, M., Schmid, M. S., & Xu, X. (2012). A dynamic usage based perspective on L2 writing. Journal of Second Language Writing, 21(3), 239–263.
Vinay, J.-P., & Darbelnet, J. (1995). Comparative stylistics of French and English. Translated by J. Sager & M.-J. Hamel. Amsterdam: John Benjamins.
Welcomme, A. (2013). Jonction interpropositionnelle et complexité syntaxique dans les récits d’apprenants néerlandophones et locuteurs natifs du français. In U. Paprocka-Piotrowska, C. Martinot, & S. Gerolimich (Eds.), La complexité en langue et son acquisition [The complexity of language and its acquisition] (pp. 261–284). Lublin: Towarzystwo Naukowe Kul Katolicki Uniwersytet Jana Pawla II.
Cited by 20 other publications
Brasolin, Paolo & Arianna Bienati
Fioravanti, Irene
Gablasova, Dana & Vaclav Brezina
2025. Adjective + noun collocations in L2 spoken English. International Journal of Learner Corpus Research 11:1, pp. 79 ff.
Gao, Wei & Bingliang Xu
Kopotev, Mikhail, Olesya Kisselev & Anton Vakhranev
Larsson, Tove & Douglas Biber
2025. Encouraging cumulative knowledge building as normal practice in (learner) corpus research. International Journal of Learner Corpus Research 11:1, pp. 1 ff.
Lindqvist, Christina
2025. Examining vocabulary knowledge in languages other than English. In Approaches and Methods in French Second Language Acquisition Research [Research Methods in Applied Linguistics, 9], pp. 95 ff.
Paquot, Magali & Hubert Naets
2025. Phraseological sophistication as a multidimensional construct. International Journal of Learner Corpus Research 11:1, pp. 217 ff.
Rubin, Rachel, Bram Bulté, Magali Paquot & Alex Housen
Spina, Stefania
Vandeweerd, Nathan & Klara Arvidsson
2025. Not just quantity but quality. Study Abroad Research in Second Language Acquisition and International Education 10:1, pp. 102 ff.
Vandeweerd, Nathan, Fanny Forsberg Lundell & Klara Arvidsson
Hartmann, Stefan
2024. Open Corpus Linguistics – or How to overcome common problems in dealing with corpus data by adopting open research practices. In Challenges in corpus linguistics [Studies in Corpus Linguistics, 118], pp. 89 ff.
Mak, Matthew H.C.
Appel, Randy, Angel Arias, Beverly Baker & Guillaume Loignon
Eguchi, Masaki & Kristopher Kyle
Vandeweerd, Nathan, Alex Housen & Magali Paquot
Vandeweerd, Nathan, Alex Housen & Magali Paquot
Paquot, Magali, Dana Gablasova, Vaclav Brezina & Hubert Naets
2022. Phraseological complexity in EFL learners’ spoken production across proficiency levels. In Complexity, Accuracy and Fluency in Learner Corpus Research [Studies in Corpus Linguistics, 104], pp. 115 ff.
Vandeweerd, Nathan
This list is based on CrossRef data as of 12 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
