Article In: Humans, Machines, and Embedded Translation
Edited by Sandra L. Halverson and Jean Nitzke
[Translation, Cognition & Behavior 8:2] 2025
How dependency-based syntactic complexity shapes post-editing of LLM-generated translations
Global and diagnostic evidence
Abstract
This study examines how dependency-based syntactic complexity shapes English-to-Chinese post-editing of GPT-4 output. We address three questions: (RQ1) whether syntactic complexity on the source-text and GPT-output sides predicts post-editing quality; (RQ2) whether translator expertise moderates syntactic complexity effects; and (RQ3) whether task conditions affect performance and moderate syntactic complexity effects. We collected data from 46 participants (30 students, 16 professionals) post-editing in Trados Studio. We operationalized complexity with dependency-based metrics grounded in Incomplete Dependency Theory and Dependency Locality Theory, complemented by English–Chinese–specific indices (Left-Embeddedness; Nested Noun Distance). Results show that complexity in both source texts and GPT-4 output predicts post-editing errors (RQ1); experts are less sensitive to rising complexity than students, especially at higher complexity levels (RQ2); termbase access reduces overall and terminology errors, and the expert advantage in handling complexity is larger under Light post-editing but attenuated under Full post-editing (RQ3).
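As a rough illustration of the dependency-based metrics named above, mean dependency distance can be computed from head–dependent index pairs in a parse. This is a minimal sketch under assumed conventions (1-based token positions, 0 marking the root); the study's actual IDT/DLT operationalization, parser, and normalization are not specified in this abstract.

```python
def mean_dependency_distance(heads):
    """Mean linear distance between each token and its syntactic head.

    `heads` maps 1-based token positions to the 1-based position of
    each token's head; 0 marks the root, which is excluded from the
    average. Longer average distances are taken as a proxy for higher
    dependency-based syntactic complexity.
    """
    distances = [abs(tok - head) for tok, head in heads.items() if head != 0]
    return sum(distances) / len(distances) if distances else 0.0

# Hypothetical parse of "The editor fixed the sentence":
# 1 The -> 2 editor, 2 editor -> 3 fixed (root), 4 the -> 5 sentence,
# 5 sentence -> 3 fixed
example = {1: 2, 2: 3, 3: 0, 4: 5, 5: 3}
print(mean_dependency_distance(example))  # (1 + 1 + 1 + 2) / 4 = 1.25
```

In practice such head indices would come from a dependency parser (e.g., Stanza, cited below), and DLT-style integration costs would additionally weight intervening discourse referents rather than raw distance alone.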
Article outline
- 1. Introduction
- 2. Methodology
- 2.1 Selection of source texts
- 2.2 Experimental environment
- 2.3 Participants
- 2.4 Data collection and processing
- 3. Quality assessment
- 3.1 Overall automatic quality assessment
- 3.2 Error annotation–based quality assessment
- 4. Dependency-based syntactic complexity metrics
- 4.1 General syntactic complexity
- 4.2 Language-pair–specific complexity
- 4.3 Cross-lingual transformation magnitude
- 5. Results and discussion
- 5.1 Baseline quality and error profile
- 5.2 Syntactic complexity patterns
- 5.2.1 Correlation analysis
- 5.2.2 BCR analysis
- 5.3 ST-side syntactic complexity and error-based quality outcomes
- 5.3.1 IDT_DLT_ST as a predictor of overall error counts
- 5.3.2 IDT_DLT_ST as a predictor of minor error counts
- 5.3.3 English–Chinese–specific ST complexity (NND_ST) as a predictor of overall errors
- 5.4 GPT-4 output complexity and PE quality
- 5.4.1 General GPT-output complexity (IDT_DLT_GPT) and overall errors
- 5.4.2 Terminology errors and nested-noun complexity in GPT output (NND_GPT)
- 5.5 Task conditions and moderation of complexity effects
- 6. Conclusion
- Notes
References (35)
Al-Jabr, Abdul-Fattah. 2006. “Effect of Syntactic Complexity on Translating from/into English/Arabic.” Babel: International Journal of Translation/Revue Internationale de La Traduction/Revista Internacional de Traducción 52 (3). [URL]
Allassonnière-Tang, Marc, Ying-Chun Chen, Nai-Shing Yen, and One-Soon Her. 2021. “Investigating the Branching of Chinese Classifier Phrases: Evidence from Speech Perception and Production.” Journal of Chinese Linguistics 49 (1): 71–105.
Alonso, Elisa, and Lucas Nunes Vieira. 2020. “The Impact of Technology on the Role of the Translator in Globalized Production Workflows.” In The Routledge Handbook of Translation and Globalization. Routledge. [URL].
Carl, Michael, Moritz Schaeffer, and Srinivas Bangalore. 2016. “The CRITT Translation Process Research Database.” In New Directions in Empirical Translation Process Research, edited by Michael Carl, Srinivas Bangalore, and Moritz Schaeffer. New Frontiers in Translation Studies. Springer International Publishing.
De Marneffe, Marie-Catherine, Christopher D. Manning, Joakim Nivre, and Daniel Zeman. 2021. “Universal Dependencies.” Computational Linguistics 47 (2): 255–308.
Fleiss, Joseph L., Bruce Levin, and Myunghee Cho Paik. 1981. “The Measurement of Interrater Agreement.” In Statistical Methods for Rates and Proportions, 2nd ed., 212–36.
Freitag, Markus, David Grangier, and Isaac Caswell. 2020. “BLEU Might Be Guilty but References Are Not Innocent.” arXiv preprint arXiv:2004.06063.
Germann, Ulrich. 2008. “Yawat: Yet Another Word Alignment Tool.” Proceedings of the ACL-08: HLT Demo Session, 20–23. [URL].
Gibson, Edward. 1998. “Linguistic Complexity: Locality of Syntactic Dependencies.” Cognition 68 (1): 1–76.
———. 2000. “The Dependency Locality Theory: A Distance-Based Theory of Linguistic Complexity.” In Image, Language, Brain, 95–126. Cambridge, MA: MIT Press.
Guerreiro, Nuno M., Ricardo Rei, Daan van Stigt, Luisa Coheur, Pierre Colombo, and André F. T. Martins. 2024. “xCOMET: Transparent Machine Translation Evaluation through Fine-Grained Error Detection.” Transactions of the Association for Computational Linguistics 12: 979–95.
Heilmann, Arndt. 2020. “Profiling Effects of Syntactic Complexity in Translation: A Multi-Method Approach.” PhD Thesis, Rheinisch-Westfälische Technische Hochschule Aachen. [URL]
Her, One-Soon, and Hui-Chin Tsai. 2020. “Left Is Right, Right Is Not: On the Constituency of the Classifier Phrase in Chinese.” Language and Linguistics 21 (1): 1–32.
Hvelplund, Kristian Tangsgaard. 2016. “Cognitive Efficiency in Translation.” In Benjamins Translation Library, edited by Ricardo Muñoz Martín, vol. 128. John Benjamins Publishing Company.
ISO. 2017. “ISO 18587:2017.” Accessed August 1, 2025. [URL]
Koby, Geoffrey S. 2015. “The ATA Flowchart and Framework as a Differentiated Error-Marking Scale in Translation Teaching.” In Handbook of Research on Teaching Methods in Language Translation and Interpretation. IGI Global Scientific Publishing. [URL].
Koby, Geoffrey S., and Gertrud G. Champe. 2013. “Welcome to the Real World: Professional-Level Translator Certification.” Translation & Interpreting: The International Journal of Translation and Interpreting Research 5 (1): 156–73.
Kroon, Martin. 2022. “Towards the Automatic Detection of Syntactic Differences.” PhD Thesis, Netherlands Graduate School of Linguistics.
Li, Yen-Hui Audrey. 2014. “Structure of Noun Phrases: Left or Right.” Taiwan Journal of Linguistics 12 (2): 1–32.
Liu, Kanglong, and Muhammad Afzaal. 2021. “Syntactic Complexity in Translated and Non-Translated Texts: A Corpus-Based Study of Simplification.” Plos One 16 (6): e0253454.
Qi, Peng, Yuhao Zhang, Yuhui Zhang, Jason Bolton, and Christopher D. Manning. 2020. “Stanza: A Python Natural Language Processing Toolkit for Many Human Languages.” arXiv preprint arXiv:2003.07082.
Saeedi, Ali, and Longhui Zou. 2022. “The Effect of Orthography and Language Orientation on Translation Effort.” Book of Abstracts, 1511. Accessed August 1, 2025. [URL]
Sun, Sanjun, and Gregory M. Shreve. 2014. “Measuring Translation Difficulty: An Empirical Study.” Target. International Journal of Translation Studies 26 (1): 98–127.
Sweller, John. 1988. “Cognitive Load during Problem Solving: Effects on Learning.” Cognitive Science 12 (2): 257–85.
Vanroy, Bram, Moritz Schaeffer, and Lieve Macken. 2021. “Comparing the Effect of Product-Based Metrics on the Translation Process.” Frontiers in Psychology 12: 681945.
Vanroy, Bram, Orphée De Clercq, Arda Tezcan, Joke Daems, and Lieve Macken. 2021. “Metrics of Syntactic Equivalence to Assess Translation Difficulty.” In Explorations in Empirical Translation Process Research, 259–94. Cham: Springer International Publishing.
Vieira, Lucas Nunes. 2016. Cognitive Effort in Post-Editing of Machine Translation: Evidence from Eye Movements, Subjective Ratings, and Think-Aloud Protocols. [URL]
Yamada, Masaru, Takanori Mizowaki, Longhui Zou, and Michael Carl. 2022. “Trados-to-Translog-II: Adding Gaze and Qualitivity Data to the CRITT TPR-DB.” Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, 295–96. [URL]
Zou, Longhui. 2024. “Cognitive Processes in Human-ChatGPT Interaction during Machine Translation Post-Editing.” PhD Thesis, Kent State University. [URL]
Zou, Longhui, Ali Saeedi, and Geoffrey S. Koby. 2025. “Beyond Automated Metrics: Assessing GPT-4o and Google Translate against Professional Translation Standards.” SKASE Journal of Translation and Interpretation 18 (2): 165–87.
Zou, Longhui, and Michael Carl. 2022. “Trados and the CRITT TPR-DB: Translation Process Research in an Ecologically Valid Environment.” In Model Building in Empirical Translation Studies: Proceedings of TRICKLET Conference, 38–40.
Zou, Longhui, Li Ke, Joshua Lamerton, and Mehdi Mirzapour. 2025. “GenAIese: A Comprehensive Comparison of GPT-4o and DeepSeek-V3 for English-to-Chinese Academic Translation.” In Proceedings of the Eleventh Workshop on Patent and Scientific Literature Translation (PSLT 2025), 1–12.
Zou, Longhui, Michael Carl, and Devin Gilbert. 2023. “Integrating Trados-Qualitivity Data to the CRITT TPR-DB: Measuring Post-Editing Process Data in an Ecologically Valid Setting.” In Corpora and Translation Education, edited by Jun Pan and Sara Laviosa. New Frontiers in Translation Studies. Springer Nature Singapore.
Zou, Longhui, Michael Carl, and Jia Feng. 2025. “Cognitive Processes of Post-Editing Generative AI: Examining Student Translators’ Interactions with ChatGPT Outputs.” In Translation Studies in the Age of Artificial Intelligence. Routledge. Accessed August 1, 2025. [URL].
Zou, Longhui, Michael Carl, Mehdi Mirzapour, Hélène Jacquenet, and Lucas Nunes Vieira. 2022. “AI-Based Syntactic Complexity Metrics and Sight Interpreting Performance.” In Intelligent Human Computer Interaction, edited by Jong-Hoon Kim, Madhusudan Singh, Javed Khan, Uma Shanker Tiwary, Marigankar Sur, and Dhananjay Singh, vol. 13184. Lecture Notes in Computer Science. Springer International Publishing.
