Article published in: Terminology
Vol. 14:2 (2008), pp. 204–229
Measuring mono-word termhood by rank difference via corpus comparison
Published online: 12 December 2008
https://doi.org/10.1075/term.14.2.05kit
Terminology, as a set of concept carriers, crystallizes our special knowledge about a subject. Automatic term recognition (ATR) plays a critical role in the processing and management of various kinds of information, knowledge and documents, e.g., knowledge acquisition via text mining. Measuring termhood properly is one of the core issues in ATR. This article presents a novel approach to termhood measurement for mono-word terms via corpus comparison, which quantifies the termhood of a term candidate as the difference between its rank in a domain corpus and its rank in a background corpus. Our ATR experiments on identifying legal terms in Hong Kong (HK) legal texts, with the British National Corpus (BNC) as the background corpus, provide evidence of the validity and effectiveness of this approach. Without any prior knowledge or ad hoc heuristics, it achieves a precision of 97.0% on the top 1000 candidates and a precision of 96.1% on the top 10% of candidates most highly ranked by the termhood measure, demonstrating state-of-the-art performance on mono-word ATR.
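The rank-difference idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the details here are assumptions made for illustration only: ranks are assigned by ascending frequency (words with equal frequency share a rank), each rank is normalized by the highest rank in its corpus, and a word absent from the background corpus receives a background rank of 0. The function names `normalized_ranks`, `termhood_scores` and `precision_at_k` are hypothetical and are not taken from the article.

```python
from collections import Counter


def normalized_ranks(tokens):
    """Map each word to a rank in (0, 1]; the most frequent word gets 1.0.

    Assumption: words with the same frequency share the same rank.
    """
    counts = Counter(tokens)
    # Distinct frequency values, ascending: lowest frequency gets rank 1.
    levels = sorted(set(counts.values()))
    rank_of = {freq: i + 1 for i, freq in enumerate(levels)}
    top = len(levels)
    return {word: rank_of[freq] / top for word, freq in counts.items()}


def termhood_scores(domain_tokens, background_tokens):
    """Score each domain word by its normalized rank difference.

    Assumption: a word unseen in the background corpus has background rank 0.
    """
    domain = normalized_ranks(domain_tokens)
    background = normalized_ranks(background_tokens)
    return {w: r - background.get(w, 0.0) for w, r in domain.items()}


def precision_at_k(ranked_words, gold_terms, k):
    """Fraction of the top-k ranked candidates that are true terms."""
    top = ranked_words[:k]
    return sum(1 for w in top if w in gold_terms) / len(top)


if __name__ == "__main__":
    # Toy corpora, invented for illustration; not data from the article.
    domain = "the court held that the defendant breached the covenant".split()
    background = "the cat sat on the mat and the dog held that bone".split()
    scores = termhood_scores(domain, background)
    # Highest termhood first; ties broken alphabetically.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    for word in ranked:
        print(f"{word}\t{scores[word]:+.2f}")
```

In this sketch a word that ranks equally high in both corpora, such as a function word, scores near zero, while a word that is prominent only in the domain corpus scores high. Precision figures like those reported in the abstract would correspond to calling `precision_at_k` on the ranked candidates against a manually validated term list.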
Cited by 30 other publications
