Chapter 9. Scoledit: A tool to analyse learner writing and better understand the challenges of language education

Wolfarth, Claire; Ponton, Claude; Brissaud, Catherine

doi:10.1075/scl.102.09wol

In: Beyond Concordance Lines: Corpora in language education
Edited by Pascual Pérez-Paredes and Geraldine Mark
[Studies in Corpus Linguistics 102] 2021
pp. 207–230

Claire Wolfarth | Grenoble Alpes University, Lidilem, Grenoble

Claude Ponton | Grenoble Alpes University, Lidilem, Grenoble

Catherine Brissaud | Grenoble Alpes University, Lidilem, Grenoble

Published online: 22 December 2021

https://doi.org/10.1075/scl.102.09wol

Abstract

The purpose of Scoledit is to build a computer-aided longitudinal corpus of texts written by pupils aged 6 to 11, together with associated automatic processing tools. The project seeks to produce linguistic descriptions of pupils’ writing and to facilitate the teaching of spelling and writing. An increasing number of projects aim to create large primary school corpora of French (Elalouf, 2005; Garcia-Debanc & Bonnemaison, 2014; David & Doquet, 2016). However, these corpora are neither longitudinal nor associated with natural language processing (NLP) tools (Wolfarth, 2017). This chapter discusses some of the automated tools developed for linguistic analysis and the advantages of the Scoledit project in the context of language teaching.

Keywords: first language learner corpora, natural language processing tools, linguistic description of writing skills, Scoledit

Article outline

Context
The Scoledit project
Corpus design
Specific tools for processing
Description of the longitudinal corpus
Grammatical categories
Breakdown of error categories
Breakdown of errors by grammatical category
Observation of verbal morphology
Breakdown of verb tenses
Error breakdown
Distinction of errors in the stem and the inflection
Teaching recommendations on verbal tenses
Hyposegmentation and hypersegmentation
Elision, a frequent factor in hyposegmentation
Hyposegmentation: The case of reflexive verbs
A particular hyposegmentation issue: The alternation of la/‘l’a’
Teaching recommendations on word segmentation
Conclusion
Notes
References

References

Banerji, N., Gupta, V., Kilgarriff, A., & Tugwell, D. (2013). Oxford children’s corpus: A corpus of children’s writing, reading, and education. Corpus Linguistics 2013, 315–317.

Berkling, K. (2016). Corpus for children’s writing with enhanced output for specific spelling patterns (2nd and 3rd Grade). In N. Calzolari, K. Choukri, T. Declerck, S. Goggi … S. Piperidis (Eds.), Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp. 3200–3206). European Language Resources Association (ELRA).

Berkling, K. (2018). A 2nd longitudinal corpus for children’s writing with enhanced output for specific spelling patterns. In N. Calzolari, K. Choukri, C. Cieri, T. Declerck … T. Tokunaga (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC-2018) (pp. 2262–2268). European Language Resources Association (ELRA).

Boré, C., & Elalouf, M.-L. (2017). Deux étapes dans la construction de corpus scolaires: Problèmes récurrents et perspectives nouvelles. Corpus, 16, 31–64.

Brissaud, C., & Chevrot, J.-P. (2011). The late acquisition of a major difficulty of French inflectional orthography: The homophonic /E/ verbal endings. Writing Systems Research, 3(2), 129–144.

Catach, N. (1980). L’orthographe française: Traité théorique et pratique avec des travaux d’application et leurs corrigés (Vol. 3). Nathan.

Chipere, N., Malvern, D., & Richards, B. (2004). Using a corpus of children’s writing to test a solution to the sample size problem affecting type-token ratios. In G. Aston, S. Bernardini, & D. Stewart (Eds.), Corpora and language learners (pp. 139–147). John Benjamins.

Clanché, P. (1988). L’enfant écrivain: Génétique et symbolique du texte libre. Paidos Le Centurion. Persée. Retrieved from [URL]

De Vogüé, S., Espinoza, N., Garcia, B., Perini, M., & Marzena Watorek, F. (2017). Constitution d’un grand corpus d’écrits émergents et novices: Principes et méthodes. Corpus, 16, 65–86.

Doquet, C., Enoiu, V., Fleury, S., & Maziotti, S. (2017). Problèmes posés par la transcription et l’annotation d’écrits d’élèves. Corpus, 16, 133–156.

Elalouf, M.-L. (2005). Écrire entre 10 et 14 ans: Un corpus, des analyses, des repères pour la formation. Canopé – CRDP de Versailles.

Fabre, C. (1990). Les brouillons d’écoliers ou l’entrée dans l’écriture. Ceditel.

Garcia-Debanc, C., & Bonnemaison, K. (2014). La gestion de la cohésion textuelle par des élèves de 11–12 ans: Réussites et difficultés. Actes du 4e Congrès Mondial de Linguistique Française (CMLF 2014), Juillet 2014, 8, 961–976.

Gendner, V., & Adda-Decker, M. (2002). Analyse comparative de corpus oraux et écrits français: Mots, lemmes et classes morpho-syntaxiques. Actes des XIVes Journées d’Etude sur la Parole, Nancy.

Juel, C. (1988). Learning to read and write: A longitudinal study of 54 children from first through fourth grades. Journal of Educational Psychology, 80(4), 437–447.

Lavalley, R., Berkling, K., & Stüker, S. (2015). Preparing children’s writing database for automated processing. LTLT@SLaTE, 9–15.

Lété, B., Sprenger-Charolles, L., & Colé, P. (2004). MANULEX: A grade-level lexical database from French elementary school readers. Behavior Research Methods, Instruments, & Computers, 36, 156–166.

Penloup, M.-C. (2001). De quelques propriétés d’une pratique de lecture extrascolaire: Le courrier des lecteurs du journal Astrapi. Repères. Recherches en Didactique du Français Langue Maternelle, 23(1), 75–91.

Savelli, M., Brissaud, C., Chevrot, J.-P., & Gounon, V. (2002). L’apprentissage d’un temps peu enseigné: Le passé simple. Le Français Aujourd’hui, 4, 39–48.

Schmid, H. (1994). Probabilistic part-of-speech tagging using decision trees. In Proceedings of the International Conference on New Methods in Language Processing. Manchester, UK (pp. 44–49).

Smith, N., McEnery, T., & Ivanic, R. (1998). Issues in transcribing a corpus of children’s handwritten projects. Literary and Linguistic Computing, 13(4), 217–225.

Wolfarth, C., Brissaud, C., & Ponton, C. (2018). Transcrire et normer un corpus scolaire: Pour quelles analyses? In C. Brissaud, M. Dreyfus, & B. Kervyn (Eds.), Repenser l’écriture et son évaluation au primaire et au secondaire (pp. 121–146). Presses universitaires de Namur.

Wolfarth, C., Ponton, C., & Brissaud, C. (2018). Gestion de la morphographie verbale en production d’écrits: Que peut nous apprendre un corpus longitudinal? Repères. Recherches en Didactique du Français Langue Maternelle, 57, 209–226.

Wolfarth, C., Ponton, C., & Totereau, C. (2017). Apports du TAL à la constitution et à l’exploitation d’un corpus scolaire. Corpus, 16.
