Article published in: Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics
Vol. 35:1 (2022), pp. 1–29
Functionally-defined recurrent multi-word units in English-to-Polish translation
A corpus-based study
Published online: 23 December 2021
https://doi.org/10.1075/resla.19037.gra
Abstract
This study uses both parallel and comparable reference corpora in the English-Polish language pair to explore how
translators deal with recurrent multi-word items performing specific discoursal functions. We also consider whether the observed
tendencies overlap with those found in native texts, and the extent to which the discoursal functions realised by the multi-word
items under scrutiny are “preserved” in translation. Capitalizing on findings from earlier research (Granger, 2014; Grabar & Lefer, 2015), we analyzed a
pre-selected set of phrases signaling stance-taking and those functioning as textual, discourse-structuring devices originally
found in the European Parliament proceedings corpus (Koehn, 2005) and included in the
English-Polish parallel corpus Paralela (Pęzik, 2016). Since our goal
was to explore whether and to what extent English functionally-defined phrases reflect the same level of formulaicity and
regularity in both Polish translations and native Polish texts, the findings provided insights into the translation tendencies of
such items, and revealed – using inter-rater agreement metrics – that the discoursal functions of recurrent n-grams may change in
translation.
Resumen
Unidades multipalabra recurrentes y funcionalmente definidas en la traducción del inglés al polaco: Un estudio basado en corpus
El presente estudio utiliza corpus de referencia paralelos y comparables en el par de idiomas inglés-polaco
a fin de explorar cómo tratan los traductores las secuencias de palabras recurrentes que realizan funciones discursivas
específicas. También analizamos si las tendencias observadas se superponen con las encontradas en los textos nativos, y en qué
medida las funciones discursivas realizadas por las unidades multipalabra halladas se conservan en la traducción. Valiéndonos de
los resultados de investigaciones anteriores (Granger, 2014; Grabar & Lefer, 2015), analizamos un conjunto preseleccionado de frases que señalan ya sea la toma de
postura o aquellas que tienen una función textual en la estructuración del discurso, procedentes originalmente del corpus del
Parlamento Europeo (Koehn, 2005) e incluidas en el corpus paralelo inglés-polaco
Paralela (Pęzik, 2016). Dado que nuestro objetivo consiste en explorar si las
frases definidas funcionalmente en inglés reflejan, o hasta qué punto lo hacen, el mismo nivel de formulaicidad y regularidad
tanto en las traducciones polacas como en los textos nativos polacos, los resultados aportan conocimiento sobre las tendencias en
la traducción de tales elementos, y revelan – utilizando métricas sobre acuerdo de inter-evaluadores – que las funciones
discursivas de los n-gramas recurrentes pueden cambiar en la traducción.
Article outline
- 1. Introduction
- 2. Metadiscourse and corpus-based translation studies
- 3. Methodology
- 3.1 Research material
- 3.2 Units of analysis
- 3.3 Research questions and hypotheses
- 3.4 Procedures and study stages
- 4. Results
- 4.1 Recurrent n-grams in English-to-Polish translation in the EPP sub-corpus of Paralela
- 4.2 Polish equivalents in native-Polish texts
- 4.3 Discourse functions in English-to-Polish translation of functionally-defined n-grams
- 5. Conclusions
- Notes
- References
- Dictionaries (accessed online)
- Corpora
References (71)
Baker, M. (1993). Corpus Linguistics and Translation Studies: Implications and applications. In M. Baker, G. Francis & E. Tognini-Bonelli (Eds.), Text and Technology: In Honour of John Sinclair (pp. 233–250). Amsterdam: John Benjamins.
Baker, M. (2000). Towards a Methodology for Investigating the Style of a Literary Translator. Target, 12(2), 241–266.
Baker, M. (2004). A corpus-based view of similarity and difference in translation. International Journal of Corpus Linguistics, 9(2), 167–193.
Bernardini, S. (2004). Corpus-aided pedagogy for translator education. In K. Malmkjaer (Ed.), Translation in Undergraduate Degree Programmes (pp. 97–112). Amsterdam: John Benjamins.
Berūkštienė, D. (2017). A corpus-driven analysis of structural types of lexical bundles in court judgments in English and their translation into Lithuanian. Kalbotyra, 701, 7–31.
Biber, D. (2006). University Language. A corpus-based study of spoken and written registers. Amsterdam: John Benjamins.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). The Longman Grammar of Spoken and Written English. London: Longman.
Biber, D., & Zhang, M. (2018). Expressing evaluation without grammatical stance: informational persuasion on the web. Corpora, 13(1), 97–123.
Brezina, V. (2018). Statistics in Corpus Linguistics: A Practical Guide. Cambridge: Cambridge University Press.
Chesterman, A. (2004). Beyond the particular. In A. Mauranen & P. Kujamäki (Eds.), Translation Universals: Do they exist? (pp. 33–49). Amsterdam: John Benjamins.
Cortes, V. (2008). A comparative analysis of lexical bundles in academic history writing in English and Spanish. Corpora, 3(1), 43–57.
Davidson, D. (1986). A Nice Derangement of Epitaphs. In E. Lepore (Ed.), Truth and Interpretation: Perspectives on the Philosophy of Donald Davidson (pp. 433–446). Oxford: Blackwell (cited in Malmkjær, 2003, p. 127).
Dłutek, A. (2019). Expressing Modality in Commercial Agreements and Contracts – An Analysis of Polish-English Parallel Texts. In B. Lewandowska-Tomaszczyk (Ed.), Contacts and Contrasts in Cultures and Languages (pp. 35–46). Berlin: Springer.
Dobrovol’skij, D., & Pöppel, L. (2016). The discursive construction n v tom, čto and its parallels in other languages: A contrastive corpus study. Вестник Новосибирского государственного педагогического университета, 6(34), 164–175.
Dobrovol’skij, D., & Pöppel, L. (2017). Constructions in Parallel Corpora: A Quantitative Approach. In R. Mitkov (Ed.), Computational and Corpus-Based Phraseology: Proceedings of Second International Conference, Europhras 2017, London, UK, November 13–14, 2017 (pp. 41–53). Berlin: Springer.
Dupont, M., & Zufferey, S. (2017). Methodological issues in the use of directional parallel corpora. A case study of English and French concessive connectives. International Journal of Corpus Linguistics, 22(2), 270–297.
Ebeling, J., & Ebeling, S. O. (2018). Comparing n-gram-based functional categories in original versus translated texts. Corpora, 13(3), 347–370.
Forchini, P., & Murphy, A. (2008). N-grams in comparable specialized corpora. Perspectives on phraseology, translation and pedagogy. International Journal of Corpus Linguistics, 13(3), 351–367.
Freelon, D. (2010). ReCal: Intercoder reliability calculation as a web service. International Journal of Internet Science, 5(1), 20–33.
Górski, R. (2012). Zastosowanie korpusów w badaniu gramatyki. In A. Przepiórkowski, M. Bańko, R. Górski & B. Lewandowska-Tomaszczyk (Eds.), Narodowy Korpus Języka Polskiego (pp. 291–300). Warszawa: Wydawnictwo Naukowe PWN.
Grabar, N., & Lefer, M-A. (29 Jun–1 Jul 2015). Building a lexical bundle resource for CAT and MT. [PDF File] Workshop on Multi-word Units in Machine Translation and Translation Technology (MUMTTT2015) of EUROPHRAS 2015, Malaga, Spain. [URL]
Grabowski, Ł. (2013). Interfacing corpus linguistics and computational stylistics: translation universals in translational literary Polish. International Journal of Corpus Linguistics, 18(2), 254–280.
Grabowski, Ł. (2018). Stance bundles in English-to-Polish translation: a corpus-informed study. Russian Journal of Linguistics, 22(2), 404–422.
Grabowski, Ł. (2020). Phrase frames as an exploratory tool for studying English-to-Polish translation patterns: a descriptive corpus-based study. Across Languages and Cultures, 21(2), 217–240.
Grabowski, Ł., & Groom, N. (2021). Grammar patterns as an exploratory tool for studying formulaicity in English-to-Polish translation: a corpus-based study. In A. Trklja & Ł. Grabowski (Eds.), Formulaic Language: Theories and Methods (pp. 171–190). Berlin: Language Science Press.
Granger, S. (2014). A lexical bundle approach to comparing languages. Stems in English and French. Languages in Contrast, 14(1), 58–72.
Granger, S., & Paquot, M. (2008). Disentangling the phraseological web. In S. Granger & F. Meunier (Eds.), Phraseology: An interdisciplinary perspective (pp. 27–50). Amsterdam: John Benjamins.
Granger, S., & Lefer, M.-A. (2016). From general to learners’ bilingual dictionaries: Towards a more effective fulfilment of advanced learners’ phraseological needs. International Journal of Lexicography, 29(3), 279–295.
Hareide, L. (2019). Comparable parallel corpora. A critical review of current practices in corpus-based translation studies. In I. Doval & M. Teresa Sánchez Nieto (Eds.), Parallel Corpora for Contrastive and Translation Studies: New resources and applications (pp. 19–38). Amsterdam: John Benjamins.
Hu, X., Xiao, R., & Hardie, A. (2016). How do English translations differ from non-translated English writings? A multi-feature statistical model for linguistic variation analysis. Corpus Linguistics and Linguistic Theory, aop.
Hunston, S. (2011). Corpus Approaches to Evaluation: Phraseology and Evaluative Language. New York, NY: Routledge.
Hunston, S., & Thompson, G. (2000). Evaluation in Text: Authorial Stance and the Construction of Discourse. Oxford: Oxford University Press.
Hyland, K. (2005). Stance and engagement: a model of interaction in academic discourse. Discourse Studies, 7(2), 173–192.
Hyland, K. (2008). As can be seen: Lexical bundles and disciplinary variation. English for Specific Purposes, 27(1), 4–21.
Kilgarriff, A., Baisa, V., Bušta, J., Jakubíček, M., Kovář, V., Michelfeit, J., Rychlý, P., & Suchomel, V. (2014). The Sketch Engine: ten years on. Lexicography, 1(1), 7–36.
Koehn, P. (2005). Europarl: A Parallel Corpus for Statistical Machine Translation. In Conference Proceedings: the tenth Machine Translation Summit, Phuket, Thailand: AAMT, 79–86.
Kornacki, M. (2018). Computer-assisted Translation (CAT) Tools in the Translator Training Process. Frankfurt: Peter Lang Verlag.
Krippendorff, K. (2011). Computing Krippendorff’s alpha-reliability. Philadelphia: Annenberg School for Communication Departmental Papers. Retrieved 7 April, 2019 from: [URL]
Laviosa, S. (2002). Corpus-based translation studies: theory, findings, applications. Amsterdam: Rodopi.
Lee, C. (2013). Using lexical bundle analysis as discovery tool for corpus-based translation research. Perspectives, 21(3), 378–395.
Lewandowska-Tomaszczyk, B., & Pęzik, P. (2018). Parallel and comparable language corpora, cluster equivalence and translator education. In Society and Languages in the Third Millennium – Communication.Education.Translation. Moscow: RUDN University, 131–142.
Malamatidou, S. (2018). Corpus Triangulation. Combining Data and Methods in Corpus-Based Translation Studies. London: Routledge.
Malmkjær, K. (2003). On a Pseudo-subversive use of Corpora in Translator Training. In F. Zanettin, S. Bernardini, D. Stewart (Eds.), Corpora in Translator Education (pp. 119–134). Manchester: St. Jerome.
Marco, J., & van Lawick, H. (2009). Using corpora and retrieval software as a source of materials for the translation classroom. In A. Beeby, A. Rodriguez Ines & P. Sanchez-Gijon (Eds.), Corpus Use and Translating (pp. 9–28). Amsterdam: John Benjamins.
Marco, J. (2019). Living with parallel corpora. In I. Doval & M. Teresa Sánchez Nieto (Eds.), Parallel Corpora for Contrastive and Translation Studies: New resources and applications (pp. 39–56). Amsterdam: John Benjamins.
Marshall, S. (2015). Evidential stance in translation: patterns of complementation in mediated memories. The Translator, 21(1), 50–67.
Martin, J. & White, P. (2005). Language of Evaluation: Appraisal in English. London: Palgrave Macmillan.
Mor, A. (2018). Do translation memories affect translations? Final results of the TRACE project. Perspectives: Studies in Translation Theory and Practice, 27(3), 455–476.
Munday, J. (2012). Evaluation in translation. Critical points for translator decision-making. London: Routledge.
Nishina, J. (2010). Evaluative Meanings and Disciplinary Values: A Corpus-based Study of Adjective Patterns in Research Articles in Applied Linguistics and Business Studies. [Unpublished PhD dissertation: University of Birmingham].
Noreika, M., & Seskauskiene, I. (2017). EU Regulations: Tendencies in Translating Lexical Bundles from English into Lithuanian. Vertimo Studijos, 101, 156–174.
Pearson, J. (2003). Using Parallel Texts in the Translator Training Environment. In F. Zanettin, S. Bernardini, D. Stewart (Eds.), Corpora in Translator Education (pp. 15–24). Manchester: St. Jerome.
Pęzik, P. (2012). Język mówiony w NKJP. In A. Przepiórkowski, M. Bańko, R. Górski & B. Lewandowska-Tomaszczyk (Eds.), Narodowy Korpus Języka Polskiego (pp. 37–47). Warszawa: Wydawnictwo Naukowe PWN.
Pęzik, P. (2016). Exploring phraseological equivalence with Paralela. In E. Gruszczyńska & A. Leńko-Szymańska (Eds.), Polish-Language Parallel Corpora (pp. 67–81). Warszawa: Instytut Lingwistyki Stosowanej UW.
Philip, G. (2009). Arriving at equivalence: Making a case for comparable general reference corpora. In A. Beeby, A. Rodriguez Ines & P. Sanchez-Gijon (Eds.), Corpus Use and Translating (pp. 59–74). Amsterdam: John Benjamins.
Przepiórkowski, A., Bańko, M., Górski, R. & Lewandowska-Tomaszczyk, B. (Eds.) (2012). Narodowy Korpus Języka Polskiego (National Corpus of Polish). Warszawa: Wydawnictwo Naukowe PWN.
Roemer, U. (2008). Identification impossible? A corpus approach to realisation of evaluative meaning in academic language. Functions of Language, 15(1), 115–130.
Teubert, W. (2002). The role of parallel corpora in translation and multilingual lexicography. In B. Altenberg & S. Granger (Eds.), Lexis in Contrast. Corpus-based approaches (pp. 189–214). Amsterdam: John Benjamins.
