Functionally-defined recurrent multi-word units in English-to-Polish translation: A corpus-based study

Grabowski, Łukasz; Groom, Nicholas

doi:10.1075/resla.19037.gra

Article published in: Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics
Vol. 35:1 (2022), pp. 1–29

Łukasz Grabowski | University of Opole

Nicholas Groom | University of Birmingham

Published online: 23 December 2021

https://doi.org/10.1075/resla.19037.gra

Abstract

This study uses both parallel and comparable reference corpora in the English-Polish language pair to explore how translators deal with recurrent multi-word items performing specific discoursal functions. We also consider whether the observed tendencies overlap with those found in native texts, and the extent to which the discoursal functions realised by the multi-word items under scrutiny are “preserved” in translation. Capitalizing on findings from earlier research, we analyzed a pre-selected set of phrases signaling stance-taking and those functioning as textual, discourse-structuring devices, originally found in the European Parliament proceedings corpus (Koehn, 2005) and included in the English-Polish parallel corpus Paralela (Pęzik, 2016). Our goal was to explore whether and to what extent English functionally-defined phrases reflect the same level of formulaicity and regularity in both Polish translations and native Polish texts. The findings provided insights into the translation tendencies of such items, and revealed, using inter-rater agreement metrics, that the discoursal functions of recurrent n-grams may change in translation.

Keywords: corpus-based study, translation studies, parallel corpora, comparable corpora, metadiscursive multi-word items, translation patterns, English-to-Polish translation

Resumen

Unidades multipalabra recurrentes y funcionalmente definidas en la traducción del inglés al polaco: Un estudio basado en corpus

El presente estudio utiliza corpus de referencia paralelos y comparables en el par de idiomas inglés-polaco a fin de explorar cómo tratan los traductores las secuencias de palabras recurrentes que realizan funciones discursivas específicas. También analizamos si las tendencias observadas se superponen con las encontradas en los textos nativos, y en qué medida las funciones discursivas realizadas por las unidades multipalabra halladas se conservan en la traducción. Valiéndonos de los resultados de investigaciones anteriores, analizamos un conjunto preseleccionado de frases que señalan la toma de postura y de frases que tienen una función textual en la estructuración del discurso, procedentes originalmente del corpus del Parlamento Europeo (Koehn, 2005) e incluidas en el corpus paralelo inglés-polaco Paralela (Pęzik, 2016). Dado que nuestro objetivo consiste en explorar si las frases definidas funcionalmente en inglés reflejan, o hasta qué punto lo hacen, el mismo nivel de formulaicidad y regularidad tanto en las traducciones polacas como en los textos nativos polacos, los resultados aportan conocimiento sobre las tendencias en la traducción de tales elementos, y revelan, utilizando métricas de acuerdo entre evaluadores, que las funciones discursivas de los n-gramas recurrentes pueden cambiar en la traducción.

Palabras clave: estudio basado en corpus, estudios de traducción, corpus paralelos, corpus comparables, unidades multipalabra, patrones traductológicos, traducción inglés-polaco

Article outline

1. Introduction
2. Metadiscourse and corpus-based translation studies
3. Methodology
- 3.1 Research material
- 3.2 Units of analysis
- 3.3 Research questions and hypotheses
- 3.4 Procedures and study stages
4. Results
- 4.1 Recurrent n-grams in English-to-Polish translation in the EPP sub-corpus of Paralela
- 4.2 Polish equivalents in native-Polish texts
- 4.3 Discourse functions in English-to-Polish translation of functionally-defined n-grams
5. Conclusions
Notes
References
Dictionaries (accessed online)
Corpora

References

Baker, M. (1993). Corpus Linguistics and Translation Studies: Implications and applications. In M. Baker, G. Francis and E. Tognini-Bonelli (Eds.), Text and Technology: in Honour of John Sinclair (pp. 233–250). Amsterdam: John Benjamins.

Baker, M. (2000). Towards a Methodology for Investigating the Style of a Literary Translator. Target, 12(2), 241–266.

Baker, M. (2004). A corpus-based view of similarity and difference in translation. International Journal of Corpus Linguistics, 9(2), 167–193.

Bernardini, S. (2004). Corpus-aided pedagogy for translator education. In K. Malmkjaer (Ed.), Translation in Undergraduate Degree Programmes (pp. 97–112). Amsterdam: John Benjamins.

Berūkštienė, D. (2017). A corpus-driven analysis of structural types of lexical bundles in court judgments in English and their translation into Lithuanian. Kalbotyra, 70(1), 7–31.

Biber, D. (2006). University Language. A corpus-based study of spoken and written registers. Amsterdam: John Benjamins.

Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). The Longman Grammar of Spoken and Written English. London: Longman.

Biber, D., & Zhang, M. (2018). Expressing evaluation without grammatical stance: informational persuasion on the web. Corpora, 13(1), 97–123.

Brezina, V. (2018). Statistics in Corpus Linguistics: A Practical Guide. Cambridge: Cambridge University Press.

Chesterman, A. (2004). Beyond the particular. In A. Mauranen & P. Kujamäki (Eds.), Translation Universals: Do they exist? (pp. 33–49). Amsterdam: John Benjamins.

Cortes, V. (2008). A comparative analysis of lexical bundles in academic history writing in English and Spanish. Corpora, 3(1), 43–57.

Davidson, D. (1986). A Nice Derangement of Epitaphs. In E. Lepore (Ed.), Truth and Interpretation: Perspectives on the Philosophy of Donald Davidson (pp. 433–446). Oxford: Blackwell (cited in Malmkjær, 2003, p. 127).

Dłutek, A. (2019). Expressing Modality in Commercial Agreements and Contracts – An Analysis of Polish-English Parallel Texts. In B. Lewandowska-Tomaszczyk (Ed.), Contacts and Contrasts in Cultures and Languages (pp. 35–46). Berlin: Springer.

Dobrovol’skij, D., & Pöppel, L. (2016). The discursive construction delo v tom, čto and its parallels in other languages: A contrastive corpus study. Вестник Новосибирского государственного педагогического университета, 6(34), 164–175.

Dobrovol’skij, D., & Pöppel, L. (2017). Constructions in Parallel Corpora: A Quantitative Approach. In R. Mitkov (Ed.), Computational and Corpus-Based Phraseology: Proceedings of Second International Conference, Europhras 2017, London, UK, November 13–14, 2017 (pp. 41–53). Berlin: Springer.

Dupont, M., & Zufferey, S. (2017). Methodological issues in the use of directional parallel corpora. A case study of English and French concessive connectives. International Journal of Corpus Linguistics, 22(2), 270–297.

Ebeling, J., & Ebeling, S. O. (2018). Comparing n-gram-based functional categories in original versus translated texts. Corpora, 13(3), 347–370.

Forchini, P., & Murphy, A. (2008). N-grams in comparable specialized corpora. Perspectives on phraseology, translation and pedagogy. International Journal of Corpus Linguistics, 13(3), 351–367.

Freelon, D. (2010). ReCal: Intercoder reliability calculation as a web service. International Journal of Internet Science, 5(1), 20–33.

Górski, R. (2012). Zastosowanie korpusów w badaniu gramatyki. In A. Przepiórkowski, M. Bańko, R. Górski & B. Lewandowska-Tomaszczyk (Eds.), Narodowy Korpus Języka Polskiego (pp. 291–300). Warszawa: Wydawnictwo Naukowe PWN.

Grabar, N., & Lefer, M.-A. (2015, June 29–July 1). Building a lexical bundle resource for CAT and MT. Workshop on Multi-word Units in Machine Translation and Translation Technology (MUMTTT2015) of EUROPHRAS 2015, Malaga, Spain. [URL]

Grabowski, Ł. (2013). Interfacing corpus linguistics and computational stylistics: translation universals in translational literary Polish. International Journal of Corpus Linguistics, 18(2), 254–280.

Grabowski, Ł. (2018). Stance bundles in English-to-Polish translation: a corpus-informed study. Russian Journal of Linguistics, 22(2), 404–422.

Grabowski, Ł. (2020). Phrase frames as an exploratory tool for studying English-to-Polish translation patterns: a descriptive corpus-based study. Across Languages and Cultures, 21(2), 217–240.

Grabowski, Ł., & Groom, N. (2021). Grammar patterns as an exploratory tool for studying formulaicity in English-to-Polish translation: a corpus-based study. In A. Trklja & Ł. Grabowski (Eds.), Formulaic Language: Theories and Methods (pp. 171–190). Berlin: Language Science Press.

Granger, S. (2014). A lexical bundle approach to comparing languages. Stems in English and French. Languages in Contrast, 14(1), 58–72.

Granger, S., & Paquot, M. (2008). Disentangling the phraseological web. In S. Granger & F. Meunier (Eds.), Phraseology: An interdisciplinary perspective (pp. 27–50). Amsterdam: John Benjamins.

Granger, S., & Lefer, M.-A. (2016). From general to learners’ bilingual dictionaries: Towards a more effective fulfilment of advanced learners’ phraseological needs. International Journal of Lexicography, 29(3), 279–295.

Hareide, L. (2019). Comparable parallel corpora. A critical review of current practices in corpus-based translation studies. In I. Doval & M. Teresa Sánchez Nieto (Eds.), Parallel Corpora for Contrastive and Translation Studies: New resources and applications (pp. 19–38). Amsterdam: John Benjamins.

Hu, X., Xiao, R., & Hardie, A. (2016). How do English translations differ from non-translated English writings? A multi-feature statistical model for linguistic variation analysis. Corpus Linguistics and Linguistic Theory, aop.

Hunston, S. (2011). Corpus Approaches to Evaluation: Phraseology and Evaluative Language. New York, NY: Routledge.

Hunston, S., & Thompson, G. (2000). Evaluation in Text: Authorial Stance and the Construction of Discourse. Oxford: Oxford University Press.

Hyland, K. (2005). Stance and engagement: a model of interaction in academic discourse. Discourse Studies, 7(2), 173–192.

Hyland, K. (2008). As can be seen: Lexical bundles and disciplinary variation. English for Specific Purposes, 27(1), 4–21.

Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., & Suchomel, V. (2014). The Sketch Engine: ten years on. Lexicography, 1(1), 7–36.

Koehn, P. (2005). Europarl: A Parallel Corpus for Statistical Machine Translation. In Conference Proceedings: the tenth Machine Translation Summit, Phuket, Thailand: AAMT, 79–86.

Kornacki, M. (2018). Computer-assisted Translation (CAT) Tools in the Translator Training Process. Frankfurt: Peter Lang Verlag.

Krippendorff, K. (2011). Computing Krippendorff’s alpha-reliability. Philadelphia: Annenberg School for Communication Departmental Papers. Retrieved 7 April, 2019 from: [URL]

Laviosa, S. (2002). Corpus-based translation studies: theory, findings, applications. Amsterdam: Rodopi.

Lee, C. (2013). Using lexical bundle analysis as discovery tool for corpus-based translation research. Perspectives, 21(3), 378–395.

Lewandowska-Tomaszczyk, B., & Pęzik, P. (2018). Parallel and comparable language corpora, cluster equivalence and translator education. In Society and Languages in the Third Millennium – Communication. Education. Translation (pp. 131–142). Moscow: RUDN University.

Malamatidou, S. (2018). Corpus Triangulation. Combining Data and Methods in Corpus-Based Translation Studies. London: Routledge.

Malmkjær, K. (2003). On a Pseudo-subversive use of Corpora in Translator Training. In F. Zanettin, S. Bernardini, D. Stewart (Eds.), Corpora in Translator Education (pp. 119–134). Manchester: St. Jerome.

Marco, J., & van Lawick, H. (2009). Using corpora and retrieval software as a source of materials for the translation classroom. In A. Beeby, A. Rodriguez Ines & P. Sanchez-Gijon (Eds.), Corpus Use and Translating (pp. 9–28). Amsterdam: John Benjamins.

Marco, J. (2019). Living with parallel corpora. In I. Doval & M. Teresa Sánchez Nieto (Eds.), Parallel Corpora for Contrastive and Translation Studies: New resources and applications (pp. 39–56). Amsterdam: John Benjamins.

Marshall, S. (2015). Evidential stance in translation: patterns of complementation in mediated memories. The Translator, 21(1), 50–67.

Martin, J. & White, P. (2005). Language of Evaluation: Appraisal in English. London: Palgrave Macmillan.

Martín-Mor, A. (2018). Do translation memories affect translations? Final results of the TRACE project. Perspectives: Studies in Translation Theory and Practice, 27(3), 455–476.

Munday, J. (2012). Evaluation in translation. Critical points for translator decision-making. London: Routledge.

Nishina, J. (2010). Evaluative Meanings and Disciplinary Values: A Corpus-based Study of Adjective Patterns in Research Articles in Applied Linguistics and Business Studies. [Unpublished PhD dissertation: University of Birmingham].

Noreika, M., & Šeškauskienė, I. (2017). EU Regulations: Tendencies in Translating Lexical Bundles from English into Lithuanian. Vertimo Studijos, 10(1), 156–174.

Pearson, J. (2003). Using Parallel Texts in the Translator Training Environment. In F. Zanettin, S. Bernardini & D. Stewart (Eds.), Corpora in Translator Education (pp. 15–24). Manchester: St. Jerome.

Pęzik, P. (2012). Język mówiony w NKJP. In A. Przepiórkowski, M. Bańko, R. Górski & B. Lewandowska-Tomaszczyk (Eds.), Narodowy Korpus Języka Polskiego (pp. 37–47). Warszawa: Wydawnictwo Naukowe PWN.

(2016). Exploring phraseological equivalence with Paralela. In E. Gruszczyńska & A. Leńko-Szymańska (Eds.), Polish-Language Parallel Corpora (pp. 67–81). Warszawa: Instytut Lingwistyki Stosowanej UW.

Philip, G. (2009). Arriving at equivalence: Making a case for comparable general reference corpora. In A. Beeby, A. Rodriguez Ines & P. Sanchez-Gijon (Eds.), Corpus Use and Translating (pp. 59–74). Amsterdam: John Benjamins.

Przepiórkowski, A., Bańko, M., Górski, R. & Lewandowska-Tomaszczyk, B. (Eds.) (2012). Narodowy Korpus Języka Polskiego (National Corpus of Polish). Warszawa: Wydawnictwo Naukowe PWN.

Pym, A. (2014). Exploring Translation Theories. Second Edition. London: Routledge.

Roemer, U. (2008). Identification impossible? A corpus approach to realisation of evaluative meaning in academic language. Functions of Language, 15(1), 115–130.

Sinclair, J. (1991). Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Teubert, W. (2002). The role of parallel corpora in translation and multilingual lexicography. In B. Altenberg & S. Granger (Eds.), Lexis in Contrast. Corpus-based approaches (pp. 189–214). Amsterdam: John Benjamins.

Tognini-Bonelli, E. (1996). Towards Translation Equivalence from a Corpus Linguistics Perspective. International Journal of Lexicography, 9(3), 197–217.

Zanettin, F., Bernardini, S., & Stewart, D. (2003). Corpora in Translator Education. London: Routledge.

Dictionaries (accessed online)

COBUILD Advanced English Dictionary (2019)

Easy Learning Idioms Dictionary (2019)

Longman Dictionary of Contemporary English (LDOCE 2019)

McGraw-Hill Dictionary of American Idioms and Phrasal Verbs (2002)

Macmillan Dictionary (2019)

Słownik Języka Polskiego (SJP) PWN (2019)

Corpora

Paralela (Pęzik, 2016)

National Corpus of Polish (NKJP) (Przepiórkowski et al., 2012)

plTenTen12 (Kilgarriff et al., 2014)
