Article published in: International Journal of Learner Corpus Research
Vol. 9:2 (2023), pp. 248–269
Corpus report
The English Language Learner Insight, Proficiency and Skills Evaluation (ELLIPSE) Corpus
Published online: 8 February 2024
https://doi.org/10.1075/ijlcr.22026.cro
Abstract
This paper introduces the open-source English Language Learner Insight, Proficiency and Skills Evaluation
(ELLIPSE) corpus. The corpus comprises ~6,500 essays written by English language learners (ELLs) during state-wide
standardized annual testing in the United States. The essays respond to 29 different independent prompts that
required no background knowledge on the part of the writer. Individual difference information is available for each essay,
including economic status, gender, grade level (8–12), and race/ethnicity. Each essay was scored by two trained human raters for
English language proficiency, yielding an overall score of English proficiency and analytic scores for cohesion, syntax,
vocabulary, phraseology, grammar, and conventions. The paper reports the reliability of these human judgments of
proficiency. The ELLIPSE corpus addresses many of the concerns raised about existing learner corpora, in particular by providing
unique holistic and analytic scores for each ELL essay. The corpus also includes limited demographic and individual difference data for each ELL.
Article outline
- 1. Introduction
- 1.2 Measuring proficiency
- 2. The ELLIPSE Corpus
- 2.1 Initial corpus
- 2.2 Proficiency scoring
- 2.2 Final ELLIPSE corpus
- 2.2.1 Text statistics
- 2.2.2 Meta-data
- 2.2.3 Score distribution
- 3. Conclusion
- Open Material badge
- Notes
