ASR-based dictation practice for second language pronunciation improvement

McCrocklin, Shannon

doi:10.1075/jslp.16034.mcc

Article published in: Journal of Second Language Pronunciation
Vol. 5:1 (2019), pp. 98–118

Shannon McCrocklin | Southern Illinois University

Published online: 13 March 2019

Abstract

In pronunciation learning, there is a need for resources and tools that help students monitor their speech or provide feedback on errors. While researchers have seen ASR-based technologies as potential tools, little attention has been paid to dictation programs, which have been criticized for low levels of recognition, but offer advantages such as accessibility and flexibility. This study examines two groups of learners in a pronunciation workshop: CONV, which had fully face-to-face instruction, and HYBRID, which had half of the instruction face-to-face and half using the computer, practicing production using a dictation program, Windows Speech Recognition. Results show that both groups improved from pre- to post-test and that there were no statistically significant differences between the two groups. Results indicate that dictation programs may be useful as a complement to face-to-face pronunciation teaching, especially if in-class time for pronunciation teaching is limited.

Keywords: pronunciation, teaching, Automatic Speech Recognition, CAPT, English as a Second Language, technology

Article outline

1. Introduction
- 1.1 Autonomy and strategy use in language learning
- 1.2 Computer assisted pronunciation training
- 1.3 Benefits of ASR technology for pronunciation learning
- 1.4 Choosing dictation programs
- 1.5 Research question
2. Methodology
- 2.1 Participants
- 2.2 Workshop design
- 2.3 Language learning logs
- 2.4 Pre- & post-workshop recordings
- 2.5 Raters
- 2.6 Analysis of ratings
3. Results
- 3.1 Time spent on activities
- 3.2 Inter-rater reliability
- 3.3 Student improvement
4. Discussion and conclusion
Note
References
