Article published in: Terminology
Vol. 19:1 (2013), pp. 1–30
TExSIS: Bilingual terminology extraction from parallel corpora using chunk-based alignment
Published online: 29 April 2013
https://doi.org/10.1075/term.19.1.01mac
We report on TExSIS, a flexible bilingual terminology extraction system that uses a sophisticated chunk-based alignment method to generate candidate terms, after which the specificity of the candidate terms is determined by combining several statistical filters. Although the set-up of the architecture is largely language-independent, we present terminology extraction results for four different languages and three language pairs. Gold standard data sets were created for French-Italian, French-English and French-Dutch, which allowed us to evaluate not only precision, as is common practice, but also recall. We compared the TExSIS approach, which takes a multilingual perspective from the start, with the more commonly used approach of first identifying term candidates monolingually and then aligning the source and target terms. A comparison of our system with the LUIZ approach described by Vintar (2010) reveals that TExSIS outperforms LUIZ for both monolingual and bilingual terminology extraction. Our results also clearly show that the precision of the alignment is crucial for the success of the terminology extraction. Furthermore, based on the observation that the precision scores for bilingual terminology extraction are higher than those of the monolingual systems, we conclude that multilingual evidence helps to determine unithood in less related languages.
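The evaluation mentioned in the abstract, scoring extracted term pairs against a gold standard for both precision and recall, can be illustrated with a minimal sketch. The function name, the representation of term pairs as (source, target) tuples, and the toy French-English data are assumptions made for illustration only; they are not taken from the article or from TExSIS itself.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of extracted term pairs against a gold standard.

    Both arguments are iterables of hypothetical (source_term, target_term) tuples.
    """
    extracted, gold = set(extracted), set(gold)
    correct = extracted & gold  # pairs that are both extracted and in the gold standard
    precision = len(correct) / len(extracted) if extracted else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


# Toy data (invented for illustration, not from the article's data sets).
gold_pairs = {
    ("boîte de vitesses", "gearbox"),
    ("frein à main", "handbrake"),
    ("pare-brise", "windscreen"),
    ("essuie-glace", "wiper"),
}
extracted_pairs = {
    ("boîte de vitesses", "gearbox"),
    ("frein à main", "handbrake"),
    ("pare-brise", "window"),  # wrong target term: counts against precision
}

precision, recall = precision_recall(extracted_pairs, gold_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision alone can be estimated by manually judging the system output, whereas recall needs a full list of the terms that should have been found, which is why a gold standard is required for it.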
