Article published in: Translation Spaces
Vol. 6:2 (2017), pp. 291–309
Translation and techniques
Making sense of neural machine translation
Published online: 4 December 2017
https://doi.org/10.1075/ts.6.2.06for
Abstract
The last few years have witnessed a surge of interest in a new machine translation paradigm: neural machine translation (NMT). NMT is starting to displace its corpus-based predecessor, statistical machine translation (SMT). In this paper, I introduce NMT and explain in detail, without the mathematical complexity, how NMT systems work, how they are trained, and how they differ from SMT systems. The paper tries to decipher NMT jargon such as “distributed representations”, “deep learning”, “word embeddings”, “vectors”, “layers”, “weights”, “encoder”, “decoder”, and “attention”, and builds upon these concepts, so that individual translators and professionals working for the translation industry, as well as students and academics in translation studies, can make sense of this new technology and know what to expect from it. Aspects such as how NMT output differs from SMT output, and the effect on the translation industry of the hardware and software requirements of NMT, both at training time and at run time, are also discussed.
Article outline
- 1. Introduction
- 2. What is neural machine translation and how does it work?
- 2.1 Neural machine translation is corpus-based machine translation
- 2.2 Neural machine translation uses neural networks
- 2.2.1 Neural units or neurons
- 2.2.2 Grouping units into layers to learn distributed representations
- 2.3 How does neural machine translation work?
- 2.3.1 Training
- 2.3.2 Machine translation as predicting the next word
- 2.3.3 Representations for words and for longer segments of text
- 2.3.4 Encoding
- 2.3.5 Decoding
- 2.4 Extensions and alternative neural machine translation architectures
- 2.4.1 Attention
- 2.4.2 “Convolutional” neural machine translation
- 2.4.3 Doing away with recursion and convolution: is attention all you need?
- 2.5 Main differences between neural and statistical machine translation
- 3. What can translators expect from neural machine translation?
- 3.1 High computational requirements
- 3.2 A different kind of output
- 3.3 Is neural machine translation better than statistical machine translation?
- 3.3.1 Automatic evaluation
- 3.3.2 Subjective evaluation
- 3.3.3 Measuring post-editing effort and productivity
- 4. Concluding remarks
- Acknowledgements
- Notes
Cited by 80 other publications, according to CrossRef data as of 6 December 2025. Please note that this count may not be complete.
