In: Applying Corpora in Teaching and Learning Romance Languages
Edited by Henry Tyne and Stefania Spina
[Studies in Corpus Linguistics 122] 2025
pp. 40–64
Chapter 2. Spoken French corpora and listening comprehension
A brief historical and critical survey
Published online: 20 November 2025
https://doi.org/10.1075/scl.122.02sur
Abstract
Since the invention of the phonograph in 1877, technology has enabled recording, and theoretically
could have provided authentic spoken data to practice listening comprehension in L2 teaching. However, this has not
really happened to the extent that could have been expected, despite the emphasis on oral communication in recent
approaches to L2 teaching and the development of spoken corpora. Indeed, advocates of data-driven learning (DDL) often
lament that “a lot remains to be done.” But what has been done in French DDL since the pioneering spoken French corpus
Le français fondamental was compiled in the 1950s? This chapter addresses some of the challenges
raised by listening comprehension, DDL, and the often-cited idea of the “learner-as-researcher”.
Article outline
- 1. Introduction
- 2. The challenge of understanding daily spoken French
- 3. Which data for L2 language teaching?
- 4. The importance of listening comprehension in L2
- 4.1 Achieving listening fluency: The challenges
- 5. “A lot still remains to be done”
- 6. How can spoken language corpora help?
- 6.1 The written language bias in corpora
- 6.2 From “real language” to corpora
- 7. Using spoken French corpora in L2 teaching: A brief six-decade overview
- 7.1 The indirect use of spoken corpora
- 7.2 Le français fondamental
- 7.2.1 Description
- 7.2.2 Pedagogical applications
- 7.2.3 Discussion
- 7.3 L’enquête socio-linguistique d’Orléans
- 7.3.1 Description
- 7.3.2 Pedagogical applications
- 7.3.3 Discussion
- 7.4 The direct use of online spoken corpora
- 7.4.1 A brief overview of the challenges of the direct use of language corpora
- 7.5 FLORALE (Français langue orale pour le FLE)
- 7.5.1 Description
- 7.5.2 Discussion
- 8. Conclusion
