In: Interpreting Technologies – Current and Future Trends
Edited by Gloria Corpas Pastor and Bart Defrancq
[IVITRA Research in Linguistics and Literature 37] 2023
pp. 241–276
Chapter 10. Automatic speech translation in the classroom and lecture
setting
Challenges, approaches, and future directions
Published online: 9 October 2023
https://doi.org/10.1075/ivitra.37.10lew
Abstract
With dramatic improvements in the quality of the
technologies underlying Speech Translation, e.g., Speech Recognition and
Machine Translation, the viability of Speech Translation in
certain scenarios may finally be within reach. Nowhere is this truer than in
educational settings where, with ever-growing immigration and an
increasingly global workforce, multilingual classrooms and educational
settings have become the norm around the world rather than the exception.
The problem is that many educational institutions face the
daunting challenge of meeting the needs of 30–100 or more language
communities simultaneously. Scenarios include providing translated content
or instruction to linguistically diverse student populations, often in the
same classroom, and parent-educator interactions, sometimes individually but
also often in group settings. In the former scenario, speech translation
technology can be a bridge between the student’s home language and the
dominant language used in the classroom and an aid for learning the dominant
language. In the latter scenario, parents, unlike their children, may never
achieve proficiency in the dominant language(s), yet still need to be
involved in their children’s education. The technology can provide them
access where otherwise there may be none, or where the options may be
severely limited. The large-scale multilingual requirements of these
immigrant and diverse communities when interacting with educators generally
defy reliable human-centric solutions and call for technological ones.
Article outline
- 1. Introduction
- 2. The educational setting
  - 2.1 The dilemma and the magnitude of the problem
  - 2.2 Studies and motivations
- 3. Speech translation technology
  - 3.1 History of speech translation
  - 3.2 The architecture of speech translation systems
  - 3.3 The individual components
  - 3.4 How quality is measured
  - 3.5 Technological progress
- 4. Speech translation in the schools: Scenarios and use cases
  - 4.1 Chinook middle school
  - 4.2 Karlsruhe institut für technologie (KIT)
- 5. Challenges and future directions
  - 5.1 In the schools: What we know and what we do not know
  - 5.2 Technical challenges
  - 5.3 Potential impacts on the interpretation industry
  - 5.4 Conclusion
- Notes
- References
