Article published in: Interpreting
Vol. 23:2 (2021), pp. 245–268
Assessing the fidelity of consecutive interpreting
The effects of using source versus target text as the reference material
Published online: 5 February 2021
https://doi.org/10.1075/intp.00058.han
Abstract
The study reported on in this article pertains to rater-mediated assessment of English-to-Chinese consecutive interpreting, particularly informational correspondence between an originally intended message and an actually rendered message, also known as “fidelity” in Interpreting Studies. Previous literature has documented two main methods to assess fidelity: comparing actual renditions with the source text or with an exemplar rendition carefully prepared by experts (i.e., an ideal target text). However, little is known about the potential effects of these methods on fidelity assessment. We therefore conducted the study to explore the way in which these methods would affect rater reliability, fidelity ratings and rater perception. Our analysis of quantitative data shows that the raters tended to be less reliable, less self-consistent, less lenient and less comfortable when using the source English text (i.e., Condition A) than when using the target Chinese text (i.e., Condition B: the exemplar rendition). These findings were backed up and explained by emerging themes derived from the qualitative questionnaire data. The fidelity estimates in the two conditions were also found to be strongly correlated. We discuss these findings and entertain the possibility of recruiting untrained monolinguals or bilinguals to assess fidelity of interpreting.
Article outline
- Introduction
- Interpreting assessment: An overview of practice and research
- Assessment of fidelity in interpreting
- Source versus target texts as the reference material
- Research questions
- Method
- Data source
- Raters
- Experimental design
- Rater training
- Rating procedure
- Post-hoc questionnaire
- Data analysis
- Results
- Effects on rater reliability
- Effects on the fidelity ratings
- Raters’ perceptions of the reference materials
- Discussion
- Conclusion
- Acknowledgment
References
Angelelli, C. & Jacobson, H. E. (Eds.) (2009). Testing and assessment in translation and interpreting studies. Amsterdam: John Benjamins.
Barik, H. C. (1975). Simultaneous interpretation: Temporal and quantitative data. Language and Speech 16 (3), 237–270.
Bühler, H. (1986). Linguistic (semantic) and extra-linguistic (pragmatic) criteria for the evaluation of conference interpretation and interpreters. Multilingua 5 (4), 231–235.
Campbell, S. & Hale, S. (2003). Translation and interpreting assessment in the context of educational measurement. In G. Anderman & M. Rogers (Eds.), Translation today: Trends and perspectives. Clevedon: Multilingual Matters, 205–224.
Carroll, J. B. (1966). An experiment in evaluating the quality of translations. Mechanical Translation and Computational Linguistics 9 (3–4), 55–66.
Chesterman, A. (2016). Memes of translation: The spread of ideas in translation theory (revised edition). Amsterdam: John Benjamins.
Coughlin, D. (2003). Correlating automated and human assessments of machine translation quality. Retrieved from <[URL]>
Eckes, T. (2015). Introduction to many-facet Rasch measurement: Analyzing and evaluating rater-mediated assessments. Frankfurt am Main: Peter Lang.
Gerver, D. (1969/2002). The effects of source language presentation rate on the performance of simultaneous conference interpreters. In F. Pöchhacker & M. Shlesinger (Eds.), The interpreting studies reader. London: Routledge, 53–66.
Gile, D. (1995). Fidelity assessment in consecutive interpretation: An experiment. Target 7 (1), 151–164.
(1999). Variability in the perception of fidelity in simultaneous interpretation. Hermes 22, 51–79.
(2009). Interpreting studies: A critical view from within. MonTI 1, 135–155. [URL].
Hamidi, M. & Pöchhacker, F. (2007). Simultaneous consecutive interpreting: A new technique put to the test. Meta 52 (2), 276–289.
Han, C. (2015). Investigating rater severity/leniency in interpreter performance testing: A multifaceted Rasch measurement approach. Interpreting 17 (2), 255–283.
(2016). Investigating score dependability in English/Chinese interpreter certification performance testing: A generalizability theory approach. Language Assessment Quarterly 13 (3), 186–201.
(2017). Using analytic rating scales to assess English/Chinese bidirectional interpretation: A longitudinal Rasch analysis of scale utility and rater behavior. Linguistica Antverpiensia New Series – Themes in Translation Studies 16, 196–215.
(2018a). Using rating scales to assess interpretation: Practices, problems and prospects. Interpreting 20 (1), 59–95.
(2018b). Latent trait modelling of rater accuracy in formative peer assessment of English–Chinese consecutive interpreting. Assessment & Evaluation in Higher Education 43 (6), 979–994.
(2019). A generalizability theory study of optimal measurement design for a summative assessment of English/Chinese consecutive interpreting. Language Testing 36 (3), 419–438.
Hlavac, J. (2013). A cross-national overview of translator and interpreter certification procedures. Translation & Interpreting 5 (1), 32–65.
Lee, J. (2008). Rating scales for interpreting performance assessment. The Interpreter and Translator Trainer 2 (2), 165–184.
Lee, S-B. (2015). Developing an analytic scale for assessing undergraduate students’ consecutive interpreting performances. Interpreting 17 (2), 226–254.
(2019). Holistic assessment of consecutive interpretation: How interpreter trainers rate student performance. Interpreting 21 (2), 245–269.
Lee, T-H. (1999). Simultaneous listening and speaking in English into Korean simultaneous interpretation. Meta 44 (1), 560–572.
Liu, M-H. (2004). Working memory and expertise in simultaneous interpreting. Interpreting 6 (1), 19–42.
(2013). Design and analysis of Taiwan’s interpretation certification examination. In D. Tsagari & R. van Deemter (Eds.), Assessment issues in language translation and interpreting. Frankfurt: Peter Lang, 163–178.
Liu, M-H., Chang, C-C. & Wu, S-C. (2008). Interpretation evaluation practices: Comparison of eleven schools in Taiwan, China, Britain, and the USA. Compilation and Translation Review 1 (1), 1–42.
Liu, M-H. & Chiu, Y-H. (2009). Assessing source material difficulty for consecutive interpreting: Quantifiable measures and holistic judgment. Interpreting 11 (2), 244–266.
Meuleman, C. & Van Besien, F. (2009). Coping with extreme speech conditions in simultaneous interpreting. Interpreting 11 (1), 20–34.
Myford, C. M. & Wolfe, E. W. (2003). Detecting and measuring rater effects using many-facet Rasch measurement: Part I. Journal of Applied Measurement 4 (4), 386–422.
Sawyer, D. B. (2004). Fundamental aspects of interpreter education: Curriculum and assessment. Amsterdam: John Benjamins.
Setton, R. & Dawrant, A. (2016). Conference interpreting: A trainer’s guide. Amsterdam: John Benjamins.
Setton, R. & Motta, M. (2007). Syntacrobatics: Quality and reformulation in simultaneous-with-text. Interpreting 9 (2), 199–230.
Skaaden, H. (2013). Assessing interpreter aptitude in a variety of languages. In D. Tsagari & R. van Deemter (Eds.), Assessment issues in language translation and interpreting. Frankfurt: Peter Lang, 35–50.
Stemler, S. E. & Tsai, J. (2008). Best practices in estimating interrater reliability: Three common approaches. In J. Osborne (Ed.), Best practices in quantitative methods. Thousand Oaks, CA: Sage, 29–49.
Tiselius, E. (2009). Revisiting Carroll’s scales. In C. V. Angelelli & H. E. Jacobson (Eds.), Testing and assessment in translation and interpreting studies. Amsterdam: John Benjamins, 95–121.
Tommola, J. & Helevä, M. (1998). Language direction and source text complexity: Effects on trainee performance in simultaneous interpreting. In L. Bowker, M. Cronin, D. Kenny & J. Pearson (Eds.), Unity in diversity? Current trends in translation studies. Manchester: St Jerome, 177–186.
Vermeiren, H., Gucht, J. V. & De Bontridder, L. (2009). Standards as critical success factors in assessments: Certifying social interpreters in Flanders, Belgium. In C. V. Angelelli & H. E. Jacobson (Eds.), Testing and assessment in translation and interpreting studies. Amsterdam: John Benjamins, 291–330.
Wang, W-W., Xu, Y., Wang, B-H. & Mu, L. (2020). Developing interpreting competence scales in China. Frontiers in Psychology 11, 481.
Wu, J., Liu, M. & Liao, C. (2013). Analytic scoring in interpretation test: Construct validity and the halo effect. In H-H. Liao, T-E. Kao & Y. Lin (Eds.), The making of a translator: Multiple perspectives. Taipei: Bookman, 277–292.
Cited by 16 other publications
Chen, Shiyue & Yan Lin
Han, Chao, Shirong Chen & Jia Feng
2025. Modeling rater cognition in translation assessment. Target. International Journal of Translation Studies 37:4 ► pp. 590 ff.
Han, Chao, Mengting Jiang & Qionglu Chen
Han, Chao, Xiaolei Lu & Shirong Chen
Han, Chao & Yueqing Wang
2025. Conducting replication in translation and interpreting studies. Target. International Journal of Translation Studies 37:3 ► pp. 444 ff.
Li, Yang, Xini Liao & Jia Jia
Al-Amin, Md., Fatematuz Zahra Saqui & Md. Rabbi Khan
Wu, Guo-hua, You-jie Guo, Fang-zhou Qi, Shen Zhang, Yi-xiao Wang, Xin Tong & Liang Zhang
Cai, Rendong, Jiexuan Lin & Yanping Dong
Han, Chao, Juan Hu & Yi Deng
2023. Effects of language background and directionality on raters’ assessments of spoken-language interpreting. Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics 36:2 ► pp. 556 ff.
Han, Chao & Xiaolei Lu
Han, Chao & Xiaolei Lu
Han, Chao & Xiaoqi Shang
2023. An item-based, Rasch-calibrated approach to assessing translation quality. Target. International Journal of Translation Studies 35:1 ► pp. 63 ff.
Zhao, Nan
Chen, Jing, Huabo Yang & Chao Han
This list is based on CrossRef data as of 12 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
