The COVALT PAR_ES Corpus (EN/FR/DE > ES): Indexation and analysis of a parallel corpus using CQPweb

Molés-Cases, Teresa; Oster, Ulrike

doi:10.1075/scl.90.12mol

In: Parallel Corpora for Contrastive and Translation Studies: New resources and applications
Edited by Irene Doval and M. Teresa Sánchez Nieto
[Studies in Corpus Linguistics 90] 2019
pp. 197–214

Teresa Molés-Cases | Universitat Politècnica de València, Universitat Jaume I

Ulrike Oster | Universitat Politècnica de València, Universitat Jaume I

Published online: 20 March 2019

https://doi.org/10.1075/scl.90.12mol

This contribution presents a section of the Corpus Valencià de Literatura Traduïda (COVALT), created by the research group of the same name (Department of Translation and Communication, Universitat Jaume I, Spain). The COVALT corpus is a four-million-word corpus of narrative works originally written in English, French, and German, together with their Catalan translations published in the autonomous community of Valencia between 1990 and 2000. Because the COVALT group's research focuses on translation, and more specifically on translated Catalan and Spanish, the corpus has recently been extended to include translations into Spanish published in Spain (the COVALT PAR_ES corpus). This chapter describes the COVALT PAR_ES corpus and how it was compiled and analysed with CQPweb.

Keywords: corpus compilation, corpus indexation, CQPweb, COVALT corpus

Article outline

1. Introduction
2. The corpora
3. Corpus compilation and indexation
- 3.1 Preparation of texts
- 3.2 Uploading the files to CQPweb
  - Step 1: Creating directories
  - Step 2: Encoding and indexing corpora in CWB
  - Step 3: Aligning the subcorpora
  - Step 4: Copying the files to CQPweb
  - Step 5: Activating the corpora on the web interface
4. Corpus analysis
5. Conclusion
Acknowledgements
Notes
References
