Article published in: International Journal of Corpus Linguistics
Vol. 7:1 (2002), pp. 1–20
The IJS-ELAN Slovene-English Parallel Corpus
Published online: 18 October 2002
https://doi.org/10.1075/ijcl.7.1.01erj
The paper presents an annotated parallel Slovene-English corpus developed within the EU ELAN project. The IJS-ELAN corpus was compiled as a widely distributable dataset for language engineering and for translation and terminology studies. It contains 1 million words from fifteen recent terminology-rich texts. The corpus is sentence-aligned, and each word is tagged with a context-disambiguated morphosyntactic description and a lemma. These descriptions model simple feature structures whose structure is shared between Slovene and English. The corpus is encoded according to the Guidelines for Text Encoding and Interchange and is freely available for download on the Web. IJS-ELAN can also be accessed through a powerful Web concordancer.
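The idea of a morphosyntactic description as a simple feature structure shared across both languages can be illustrated with a minimal sketch. It assumes, purely for illustration, that a description is written as a compact positional string; the attribute table, the codes, and the function name `decode_msd` are hypothetical and are not taken from the IJS-ELAN specification.

```python
# Hypothetical toy tagset: each position after the category letter
# maps to (attribute name, {code: value}). Illustrative assumption only,
# not the actual IJS-ELAN morphosyntactic specification.
NOUN_ATTRIBUTES = [
    ("type", {"c": "common", "p": "proper"}),
    ("gender", {"m": "masculine", "f": "feminine", "n": "neuter"}),
    ("number", {"s": "singular", "d": "dual", "p": "plural"}),
    ("case", {"n": "nominative", "g": "genitive", "a": "accusative"}),
]
CATEGORIES = {"N": ("Noun", NOUN_ATTRIBUTES)}


def decode_msd(msd: str) -> dict[str, str]:
    """Expand a compact positional tag into an attribute-value mapping."""
    category, attributes = CATEGORIES[msd[0]]
    features = {"category": category}
    # zip stops at the shorter sequence, so trailing attributes may be omitted.
    for (name, values), code in zip(attributes, msd[1:]):
        if code != "-":  # '-' marks an attribute that does not apply
            features[name] = values[code]
    return features


# One attribute schema serves both languages; a language simply leaves
# unset the attributes it does not mark.
slovene_noun = decode_msd("Ncfsn")  # hypothetical tag for a Slovene noun
english_noun = decode_msd("Nc-s")   # hypothetical tag for an English noun
print(slovene_noun)
print(english_noun)
```

In this sketch the two languages share the attribute inventory, so features such as number can be compared directly across an aligned sentence pair, while language-specific attributes such as case are simply absent on the English side.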
Keywords: parallel corpus, corpus encoding, tagging, concordancing
