Article published in: Perspectives on Chinese Language Acquisition
Edited by Henghua Su
[International Journal of Chinese Linguistics 12:2] 2025
pp. 286–314
Enhancing Mandarin tone acquisition
Computer-assisted pronunciation training in self-directed L2 learning
Published online: 6 October 2025
https://doi.org/10.1075/ijchl.00040.wu
Abstract
This study investigates the effectiveness of a Computer-Assisted Pronunciation Training (CAPT) system designed to
enhance Mandarin lexical tone acquisition among second language (L2) learners in a self-directed learning context. Drawing on
automated speech recognition (ASR) technology, the CAPT system provides real-time, individualized feedback on tone production
accuracy. Elementary-level Mandarin learners at an international university in China were randomly assigned to either an
experimental group engaged with the CAPT system or a control group using traditional listen-and-repeat practice. Over a two-week
period, participants completed pre- and post-tests evaluating their production of four Mandarin tones at the sentence level.
Statistical analyses revealed that the experimental group achieved significant gains in tonal accuracy across all four tones,
while the control group showed no notable improvement. Survey data further supported the CAPT system’s usability and pedagogical
value, with learners demonstrating increased awareness of tone patterns and pronunciation strategies. These findings highlight the
efficacy of ASR-based CAPT for supporting tone acquisition in tonal languages like Mandarin, especially in supplementing classroom
instruction for pronunciation development. The study addresses the underrepresentation of CAPT research on L2 Mandarin learning and
introduces an effective tool for L2 Mandarin learners and instructors.
Article outline
- 1. Introduction
- 1.1 Pronunciation in learning and teaching
- 1.2 Acquisition of pronunciation
- 1.3 CAPT development
- 1.4 Mandarin pronunciation and tone acquisition
- 1.5 Research questions
- 2. Method
- 2.1 Participants
- 2.2 Materials
- 2.3 Design and procedure
- 2.4 Data analysis
- 3. Results
- 3.1 Comparison of ASR-CAPT and control groups’ overall pronunciation accuracy
- 3.2 Improvement across four Mandarin tones
- 3.3 Learner feedback and survey analysis
- 3.3.1 Quantitative survey results
- 3.3.2 Qualitative feedback on learning strategies
- 3.3.3 Tone-specific perceptions and metacognitive awareness
- 3.3.4 Control group feedback and desired features
- 4. Discussion
- 4.1 The contribution to L2 Mandarin acquisition
- 4.2 The necessity of effective feedback in L2 learning
- 4.3 The role of personalized pronunciation learning and self-directed learning
- 4.4 The effectiveness of technology-mediated pronunciation instruction
- 4.5 Implications for future CAPT system development
- 5. Limitations
- 6. Conclusion
