In: Vocabulary Knowledge: Human ratings and automated measures
Edited by Scott Jarvis and Michael Daller
[Studies in Bilingualism 47] 2013
pp. 105–134
Chapter 4. Validating lexical measures using human scores of lexical proficiency
Published online: 14 August 2013
https://doi.org/10.1075/sibil.47.06ch4
This study examines the convergent validity of a wide range of computational indices reported by Coh-Metrix that have been associated in past studies with lexical features such as basic category words, semantic co-referentiality, word frequency, and lexical diversity. Human judgments of these lexical features in free-writing samples serve as operationalizations of the lexical constructs the indices are meant to measure. Statistical analyses were conducted to examine the convergent validity of each index and to assess how well the indices that correlate most strongly with the human judgments predict holistic scores of lexical proficiency in L1 and L2 speakers. Correlations between the automated lexical indices and the operationalized constructs showed small to large effect sizes, providing a degree of convergent validity for most of the automated indices examined. A multiple regression predicting holistic judgments of lexical proficiency from these automated lexical indices explained 40% of the variance in a training set and 37% of the variance in a test set. The findings provide a degree of confidence that the indices measure the constructs they were predicted to measure.
