Article published in: Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics
Vol. 36:2 (2023), pp. 556–584
Effects of language background and directionality on raters’ assessments of spoken-language interpreting
Published online: 6 June 2023
https://doi.org/10.1075/resla.21009.han
Abstract
Bilingual raters play an important role in assessing spoken-language interpreting (between languages X and Y). Raters for whom X is the dominant language (DL) and Y the less dominant language may differ, in terms of rating processes, from raters for whom Y is the DL and X the less dominant language, when assessing either X-to-Y or Y-to-X interpreting. As such, raters’ language background and its interaction with interpreting directionality may influence assessment outcomes. However, this complex interaction and its effects on assessment have not been investigated. We therefore conducted the current experiment to explore how raters’ language background and interpreting directionality would affect assessment of English-Chinese, two-way interpreting. Our analyses of the quantitative data indicate that raters displayed a greater level of self-confidence and self-consistency, but rated performance more harshly, when assessing interpreting into their mother tongue or DL. These statistically significant group-level disparities led to different assessment outcomes, as pass and fail rates varied depending on the rater group. These quantitative findings, coupled with the raters’ qualitative comments, may have implications for the selection and training of bilingual raters for interpreting assessment.
Article outline
- 1. Introduction
- 2. Literature review
- 2.1 Rater characteristics as construct-irrelevant factors in ESL speaking assessment
- 2.2 Raters’ language background in interpreting assessment
- 2.3 Directionality in interpreting assessment
- 2.4 Potential research niches
- 3. Research questions
- 4. Method
- 4.1 Bilingual raters
- 4.2 Interpreting recordings
- 4.3 Research design
- 4.4 Rater training
- 4.5 Operational rating
- 4.6 Data analysis
- 5. Results
- 5.1 Raters’ internal consistency
- 5.2 Rating severity
- 5.3 Assessment outcomes
- 5.4 Raters’ subjective perceptions of the rating process
- 6. Discussion
- 7. Conclusion
- Acknowledgements
- Notes
- References
