In: Recent Advances in Multiword Units in Machine Translation and Translation Technology
Edited by Johanna Monti, Gloria Corpas Pastor, Ruslan Mitkov and Carlos Manuel Hidalgo-Ternero
[Current Issues in Linguistic Theory 366] 2024
pp. 40–56
Chapter 3. Evaluating the Italian-English machine translation quality of MWUs in the domain of archaeology
Published online: 7 November 2024
https://doi.org/10.1075/cilt.366.03spe
Abstract
Multiword units (MWUs) are a challenging linguistic issue in the field of
Natural Language Processing (NLP) due to their idiosyncratic nature. This paper investigates the quality of Neural
Machine Translation (NMT) outputs when dealing with MWUs in the domain of archaeology. As a case study, a dataset of
100 MWUs is used as a Gold Standard to evaluate out-of-context and in-context translation outputs from three
state-of-the-art NMT systems for the Italian-English language pair: Google Translate, DeepL, and Microsoft Bing
Translator. MT outputs are manually evaluated with reference to the Gold Standard, namely out-of-context and
in-context human English translations of the selected 100 MWUs. Results show that terminology is still a problematic
category for MT quality and that MWU translation may vary, and sometimes even improve, when further context is
provided.
Keywords: multiword units, terminology, machine translation, evaluation, error analysis, archaeology
Article outline
- 1. Introduction
- 2. Related work
- 2.1 Terminology translation
- 2.2 Terminology translation evaluation
- 3. Experimental setup
- 3.1 The dataset
- 3.2 The evaluation methodology
- 3.3 Global evaluation
- 3.4 Local error analysis
- 4. Conclusions
