Modeling rater cognition in translation assessment: An exploratory investigation based on think-aloud, eye-tracking, and interview data

In this exploratory study, we investigated rater cognition in English–Chinese translation assessment, drawing on think-aloud, eye-tracking, and interview data. We designed a 3 × 2 × 2 experiment in which experienced raters assessed eighteen renditions of three levels of quality for each translation direction, using either a Likert-type scale or an analytic rubric scale. We found that: (a) the raters heeded meaning transfer more frequently than other content categories; (b) they utilized a variety of processing actions, but a core subset of eight actions constituted the mainstay; (c) to make a scoring decision, the raters mainly consulted the source text, the target texts, and the rating scale, but also displayed other patterns of interaction (e.g., relying on target texts only); (d) they fixated more frequently per time unit and proportionally longer on the target texts; and (e) translation direction and scoring method seemed to have modulated rater cognition. The implications of these findings for translation assessment are discussed.


The quality of translation and interpreting (T&I) is a perennial topic in T&I practice, education, and research, attracting substantial and sustained scholarly attention over the years. The extant literature tends to accentuate T&I quality as a measurable property of the T&I product, and has therefore focused on its theorization and modeling (House 1997; Pöchhacker 2002; Grbić 2008) as well as the development and validation of measurement methods (Eyckmans and Anckaert 2017; Gieshoff and Albl-Mikasa 2022; Han and Shang 2023). Although this growing body of research addresses an important topic in T&I studies (i.e., assessment of T&I quality) and numerous empirical studies have examined the psychometric properties of T&I quality ratings (e.g., Lai 2011; Han and Zhao 2021; Chen, Yang, and Han 2022), the existing scholarship has unfortunately overlooked the process whereby T&I quality is perceived, evaluated, and constructed by human raters and assessors. In other words, there is a genuine need to investigate the cognitive processing involved in the rater-mediated assessment of T&I. This observation generally resonates with Kruger (2013) and Kruger and Kruger (2017), who explicitly call for rigorous research into cognition in the reception of translation. Another important reason for initiating such research — rater cognition in T&I assessment (or, more broadly, cognition in T&I reception) — is to enrich and extend the scope of cognitive T&I studies, a vibrant and fast-growing field that has traditionally concentrated on the cognition of translators and interpreters as producers of T&I (see Muñoz 2010; Jakobsen 2017), but paid little attention to the cognition of readers and listeners as receivers of T&I (for exceptions, see Kruger 2013; Walker 2019) and, in our case, the cognition of human raters as assessors of T&I (Han et al. 2024).
We argue that an inclusive cognitive theory of T&I should shed light on the cognitive processes involved in both T&I production and reception, with the latter potentially encompassing T&I assessment (see also Kruger and Kruger 2017).


References

Angelelli, Claudia V.
2009 “Using a Rubric to Assess Translation Ability: Defining the Construct.” In Testing and Assessment in Translation and Interpreting Studies, edited by Claudia V. Angelelli and Holly E. Jacobson, 13–47. Amsterdam: John Benjamins.
Baker, Beverly Anne
2012 “Individual Differences in Rater Decision-Making Style: An Exploratory Mixed-Methods Study.” Language Assessment Quarterly 9 (3): 225–248.
Barkaoui, Khaled
2007 “Rating Scale Impact on EFL Essay Marking: A Mixed-Method Study.” Assessing Writing 12 (2): 86–107.
2010 “Variability in ESL Essay Rating Processes: The Role of the Rating Scale and Rater Experience.” Language Assessment Quarterly 7 (1): 54–74.
Chen, Jing, Huabo Yang, and Chao Han
2022 “Holistic Versus Analytic Scoring of Spoken-Language Interpreting: A Multi-Perspectival Comparative Analysis.” The Interpreter and Translator Trainer 16 (4): 558–576.
Chesterman, Andrew
1998 “Causes, Translations, Effects.” Target 10 (2): 201–230.
Crisp, Victoria
2010 “Towards a Model of the Judgement Processes Involved in Examination Marking.” Oxford Review of Education 36 (1): 1–21.
Cumming, Alister
1990 “Expertise in Evaluating Second Language Compositions.” Language Testing 7 (1): 31–51.
Cumming, Alister, Robert Kantor, and Donald E. Powers
2002 “Decision Making While Rating ESL/EFL Writing Tasks: A Descriptive Framework.” The Modern Language Journal 86 (1): 67–96.
DeRemer, Mary L.
1998 “Writing Assessment: Raters’ Elaboration of the Rating Task.” Assessing Writing 5 (1): 7–29.
Eyckmans, June, and Philippe Anckaert
2017 “Item-Based Assessment of Translation Competence: Chimera of Objectivity versus Prospect of Reliable Measurement.” Linguistica Antverpiensia, New Series: Themes in Translation Studies 16: 40–56.
Feng, Jia
2018 中英双向互译中翻译认知过程研究: 基于眼动追踪和键盘记录的实证分析 [Cognitive processing in bidirectional Chinese–English translation: Empirical evidence from eye-tracking and keystroke logging]. Beijing: Foreign Language Teaching and Research Press.
Gieshoff, Anne Catherine, and Michaela Albl-Mikasa
2022 “Interpreting Accuracy Revisited: A Refined Approach to Interpreting Performance Analysis.” Perspectives 32 (2): 210–228.
Godfroid, Aline
2020 Eye Tracking in Second Language Acquisition and Bilingualism: A Research Synthesis and Methodological Guide. New York: Routledge.
Grbić, Nadja
2008 “Constructing Interpreting Quality.” Interpreting 10 (2): 232–257.
Han, Chao, and Xiao Zhao
2021 “Accuracy of Peer Ratings on the Quality of Spoken-Language Interpreting.” Assessment and Evaluation in Higher Education 46 (8): 1299–1313.
Han, Chao, and Xiaoqi Shang
2023 “An Item-Based, Rasch-Calibrated Approach to Assessing Translation Quality.” Target 35 (1): 63–96.
Han, Chao, Binghan Zheng, Mingqing Xie, and Shirong Chen
2024 “Raters’ Scoring Process in Assessment of Interpreting: An Empirical Study Based on Eye Tracking and Retrospective Verbalization.” The Interpreter and Translator Trainer 18 (3): 400–422.
Han, Chao, Rui Xiao, and Wei Su
2021 “Assessing the Fidelity of Consecutive Interpreting: The Effects of Using Source Versus Target Text as the Reference Material.” Interpreting 23 (2): 245–268.
Han, Chao
2020 “Translation Quality Assessment: A Critical Methodological Review.” The Translator 26 (3): 257–273.
Holmqvist, Kenneth, Saga Lee Örbom, Ignace T. C. Hooge, Diederick C. Niehorster, Robert G. Alexander, Richard Andersson, Jeroen S. Benjamins, et al.
2023 “Eye Tracking: Empirical Foundations for a Minimal Reporting Guideline.” Behavior Research Methods 55: 364–416.
House, Juliane
1997 Translation Quality Assessment: A Model Revisited. Tübingen: Gunter Narr Verlag.
Huertas Barros, Elsa, and Juliet Vine
2018 “Current Trends on MA Translation Courses in the UK: Changing Assessment Practices on Core Translation Modules.” The Interpreter and Translator Trainer 12 (1): 5–24.
Hurtado Albir, Amparo, ed.
2017 Researching Translation Competence by PACTE Group. Amsterdam: John Benjamins.
Jakobsen, Arnt Lykke
2017 “Translation Process Research.” In The Handbook of Translation and Cognition, edited by John W. Schwieter and Aline Ferreira, 19–49. Hoboken: Wiley-Blackwell.
Koby, Geoffrey S.
2015 “The ATA Flowchart and Framework as a Differentiated Error-Marking Scale in Translation Teaching.” In Handbook of Research on Teaching Methods in Language Translation and Interpretation, edited by Ying Cui and Wei Zhao, 220–253. Hershey: IGI Global.
Kruger, Haidee
2013 “Child and Adult Readers’ Processing of Foreignised Elements in Translated South African Picturebooks.” Target 25 (2): 180–227.
Kruger, Haidee, and Jan-Louis Kruger
2017 “Cognition and Reception.” In The Handbook of Translation and Cognition, edited by John W. Schwieter and Aline Ferreira, 71–89. Hoboken: Wiley-Blackwell.
Lai, Tzu-Yun
2011 “Reliability and Validity of a Scale-Based Assessment for Translation Tests.” Meta 56 (3): 713–722.
Li, Hang, and Lianzhen He
2015 “A Comparison of EFL Raters’ Essay-Rating Processes across Two Types of Rating Scales.” Language Assessment Quarterly 12 (2): 178–212.
Lumley, Tom
2002 “Assessment Criteria in a Large-Scale Writing Test: What Do They Really Mean to the Raters?” Language Testing 19 (3): 246–276.
Ma, Xingcheng, and Dechao Li
2020 “翻译教师和普通读者在译文在线评阅中的认知过程研究：基于眼动追踪数据的翻译质量评测 [Cognitive processes of translation teachers and ordinary readers in reading translated texts: An eye-tracking perspective to translation quality assessment].” Foreign Languages Research 4: 28–36.
Muñoz, Ricardo
2010 “Leave No Stone Unturned: On the Development of Cognitive Translatology.” Translation and Interpreting Studies 5 (2): 145–162.
Muñoz, Ricardo, and Tomás Conde
2007 “Effects of Serial Translation Evaluation.” In Translationsqualität [Translation quality], edited by Peter A. Schmitt and Heike E. Jüngst, 428–444. Frankfurt: Peter Lang.
Obdržálková, Vanda
2018 “Directionality in Translation: Qualitative Aspects of Translation from and into English as a Non-Mother Tongue.” Sendebar 29: 35–57.
Pöchhacker, Franz
2002 “Researching Interpreting Quality: Models and Methods.” In Interpreting in the 21st Century: Challenges and Opportunities, edited by Giuliana Garzone and Maurizio Viezzi, 95–106. Amsterdam: John Benjamins.
Pokorn, Nike K., Jason Blake, Donald Reindl, and Agnes Pisanski Peterlin
2020 “The Influence of Directionality on the Quality of Translation Output in Educational Settings.” The Interpreter and Translator Trainer 14 (1): 58–78.
Rothe-Neves, Rui
2008 “Translation Quality Assessment for Research Purposes: An Empirical Approach.” Cadernos de Tradução [Translation notebooks] 2 (10): 113–131.
Sun, Sanjun, and Gregory M. Shreve
2014 “Measuring Translation Difficulty: An Empirical Study.” Target 26 (1): 98–127.
Turner, Barry, Miranda Lai, and Neng Huang
2010 “Error Deduction and Descriptors: A Comparison of Two Methods of Translation Test Assessment.” Translation & Interpreting 2 (1): 11–23.
Waddington, Christopher
2001 “Should Translations Be Assessed Holistically or through Error Analysis?” Hermes: Journal of Linguistics 14 (26): 15–38.
Walker, Callum
2019 “A Cognitive Perspective on Equivalent Effect: Using Eye Tracking to Measure Equivalence in Source Text and Target Text Cognitive Effects on Readers.” Perspectives 27 (1): 124–143.
Whyatt, Bogusława
2019 “In Search of Directionality Effects in the Translation Process and in the End Product.” Translation, Cognition and Behavior 2 (1): 79–100.
Winke, Paula, and Hyojung Lim
2015 “ESL Essay Raters’ Cognitive Processes in Applying the Jacobs et al. Rubric: An Eye-Movement Study.” Assessing Writing 25: 37–53.
Wolfe, Edward W.
1997 “The Relationship between Essay Reading Style and Scoring Proficiency in a Psychometric Scoring System.” Assessing Writing 4 (1): 83–106.
2005 “Uncovering Rater’s Cognitive Processing and Focus Using Think-Aloud Protocols.” Journal of Writing Assessment 2 (1): 37–56.
Zhang, Jie
2016 “Same Text Different Processing? Exploring How Raters’ Cognitive and Meta-Cognitive Strategies Influence Rating Accuracy in Essay Scoring.” Assessing Writing 27: 37–53.