In: Parallel Corpora for Contrastive and Translation Studies: New resources and applications
Edited by Irene Doval and M. Teresa Sánchez Nieto
[Studies in Corpus Linguistics 90] 2019
pp. 93–101
InterCorp
A parallel corpus of 40 languages
Published online: 20 March 2019
https://doi.org/10.1075/scl.90.06cer
This chapter presents the current version of InterCorp, a parallel corpus created at the Faculty of Arts, Charles University in Prague. The corpus consists of texts in Czech aligned with one or more foreign-language versions and covers 40 languages in total: Czech and 39 others. The chapter analyses the structure and technical parameters of the corpus and describes the tools used with it: Kontext, a corpus query interface, and InterText, a parallel text alignment editor created specifically for the project. It also discusses Treq (Translation Equivalents Database), a collection of bilingual Czech-foreign language dictionaries built automatically from InterCorp. The last section of the chapter discusses how the corpus can be exploited methodologically and linguistically.
Article outline
- 1. Introduction
- 2. Description of the corpus
- 2.1 The Spanish part of the corpus
- 3. Using the corpus
- 4. Specific tools: Translation equivalents database
- 5. Exploiting InterCorp
- 6. Conclusion
- Acknowledgment
- References
References (16)
Čermák, František, Corness, Patrick & Klégr, Aleš (eds). 2010. InterCorp: Exploring a Multilingual Corpus. Prague: Nakladatelství Lidové Noviny & Ústav Českého národního korpusu.
Čermák, František & Rosen, Alexandr. 2012. The case of InterCorp, a multilingual parallel corpus. International Journal of Corpus Linguistics 17(3): 411–427.
Čermák, Petr. 2007. Acerca de los corpora paralelos: El proyecto Intercorp (About the parallel corpora: The Intercorp project). Verba 34: 375–380.
Machálek, Tomáš. 2016. Kontext. <[URL]> (18 November 2017).
Nádvorníková, Olga. 2017. Pièges méthodologiques des corpus parallèles et comment les éviter (Methodological traps of parallel corpora and how to avoid them). Corela. Cognition, Représentation, Langage HS-21: 1–28.
Och, Franz Josef & Ney, Hermann. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics 29(1): 19–51.
Repository of bibliographical items based on the Czech National Corpus. 2017. <[URL]> (18 November 2017).
Rosen, Alexandr & Vavřín, Martin. 2016. Korpus InterCorp, version 9 of 9 Sep 2016. Institute of the Czech National Corpus, Charles University, Prague 2014. <[URL]> (18 November 2017).
Rosen, Alexandr & Vavřín, Martin. 2012. Building a multilingual parallel corpus for human users. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), Nicoletta Calzolari et al. (eds), 2447–2452. Istanbul: European Language Resources Association (ELRA).
Meurer, Paul. 2012. INESS-Search: A search system for LFG (and other) treebanks. In Proceedings of the LFG12 Conference, Miriam Butt & Tracy H. King (eds). Stanford, CA: CSLI Publications.
Rosen, Alexandr. 2016. InterCorp – a look behind the façade of a parallel corpus. In Polskojęzyczne korpusy równoległe. Polish-language Parallel Corpora, Ewa Gruszczyńska & Agnieszka Leńko-Szymańska (eds), 21–40. Warszawa: Instytut Lingwistyki Stosowanej.
Škrabal, Michal & Vavřín, Martin. 2017. The Translation Equivalents Database (Treq) as a Lexicographer’s Aid. In Electronic lexicography in the 21st century. Proceedings of eLex 2017 conference, Iztok Kosem et al. (eds), 124–137. Leiden: Lexical Computing CZ s. r. o.
Štichauer, Pavel & Čermák, Petr. 2016. Causative constructions of the hacer / fare + verb type in Spanish and Italian and their Czech counterparts: A parallel corpus-based study. Linguistica Pragensia 26(2): 7–20.
TreeTagger. 2017. <[URL]> (18 November 2017).
Vondřička, Pavel. 2014. Aligning parallel texts with InterText. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), Nicoletta Calzolari et al. (eds), 1875–1879. Reykjavik: European Language Resources Association (ELRA).
Vondřička, Pavel. 2016. InterText, Parallel Text Alignment Editor. <[URL]> (18 November 2017).
Cited by (2)
Mikhailov, Mikhail
This list is based on CrossRef data as of 1 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
