Article published in: International Journal of Corpus Linguistics
Vol. 19:4 (2014), pp. 443–477
Corpus frequency and second language learners’ knowledge of collocations
A meta-analysis
Published online: 25 October 2014
https://doi.org/10.1075/ijcl.19.4.01dur
Tests of second language learners’ knowledge of collocation have lacked a principled strategy for item selection, making claims about learners’ knowledge beyond the particular collocations tested difficult to evaluate. Corpus frequency may offer a good basis for item selection, if a reliable relationship can be demonstrated between frequency and learner knowledge. However, such a relationship is difficult to establish satisfactorily, given the small number of items and narrow range of test-takers involved in any individual study. In this study, a meta-analysis is used to determine the correlation between learner knowledge and frequency data across nineteen previously reported tests. Frequency is shown to correlate moderately with knowledge, but the strength of this correlation varies widely across corpora. Strength of association measures (such as mutual information) do not correlate with learner knowledge. These findings are discussed in terms of their implications for collocation testing and models of collocation learning.
Keywords: vocabulary, SLA, collocation, frequency, formulaic language, testing
