Navigating learner data in translator and interpreter training
Insights from the Chinese/English Translation and Interpreting Learner Corpus (CETILC)
Published online: 15 April 2022
https://doi.org/10.1075/babel.00260.pan
Abstract
Technological developments, in particular innovations in natural language processing and in the means of exploring big
data, have influenced various aspects of the training of translators and interpreters. This paper investigates how learner
corpora and research on them contribute to the teaching and learning of translation and interpreting. It starts with a review of the
evolution of learner corpora in translator and interpreter training. Drawing on data from the Chinese/English Translation and
Interpreting Learner Corpus (CETILC), a learner corpus developed for the study of lexical cohesion, the paper introduces three
case studies to illustrate the possibilities of exploring learner data through human annotation, machine-facilitated human
annotation, and finally human-supervised/edited machine annotation. The findings of the case studies point to the complexity of
learner language and its intricate relationships with various factors concerning the learner, text, and task. The paper ends with
a discussion of the great potential of purpose-built learner corpora such as the CETILC in translator and interpreter training,
as well as the application of learner corpora in the (semi-)automatic processing of learner texts.
Résumé
Le développement de la technologie, notamment le traitement automatique des langues et les nouveaux moyens
d’explorer les mégadonnées, a beaucoup influencé la formation des traducteurs et des interprètes. Cet article a donc pour objectif
d’explorer comment les corpus d’apprenants et les recherches sur des corpus d’apprenants peuvent contribuer à l’enseignement et à
l’apprentissage de la traduction et de l’interprétation. Nous passons d’abord en revue l’évolution des corpus d’apprenants dans la
formation des traducteurs et des interprètes. Ensuite, à partir des données issues du Corpus d’apprenants en traduction et en
interprétation chinois-anglais (CETILC), un corpus développé pour l’étude de la cohésion lexicale, nous proposons trois cas
d’étude pour illustrer les possibilités d’explorer les données des apprenants à travers l’annotation manuelle, l’annotation
manuelle assistée par ordinateur et l’annotation automatique supervisée et éditée par l’humain. Les résultats de ces cas d’étude
montrent la complexité de la langue des apprenants, ainsi que les relations complexes entre la langue des apprenants et divers
facteurs tels que l’apprenant lui-même, le texte et la tâche à accomplir. Finalement, nous observons la grande utilité des corpus
d’apprenants, comme le CETILC, qui sont spécifiquement conçus pour la formation des traducteurs et des interprètes. Nous abordons
aussi l’application des corpus d’apprenants au traitement (semi-)automatique des textes produits par les apprenants.
Article outline
- 1. Introduction
- 2. Learner corpora in translation and interpreting
- 3. Learner corpora for translator and interpreter training: Perspectives from textual level performance
- 3.1 The Chinese/English Translation and Interpreting Learner Corpus (CETILC)
- 3.2 Case study 1: Learner differences in textual level translation performance
- 3.2.1 Data description
- 3.2.2 Findings and discussion
- 3.3 Case study 2: Learner differences in the employment of cohesive devices
- 3.3.1 Data description
- 3.3.2 Findings and discussion
- 3.4 Case study 3: Learners’ employment of lexical cohesive devices under different contexts
- 3.4.1 Data description
- 3.4.2 Annotation of lexical cohesive devices
- 3.4.3 Findings and discussion
- 4. Implications and future directions
- 4.1 T&I learner corpora: Implications from the three case studies
- 4.2 Prospects for (semi-)automatic processing of T&I learner corpora
- 4.3 Future development and application of the CETILC
- 5. Conclusion
- Acknowledgements
