Article published in: Japanese Term Extraction
Kyo Kageura and Teruo Koyama
[Terminology 6:2] 2000
pp. 233–256
Term recognition using corpora from different fields
Published online: 1 October 2001
https://doi.org/10.1075/term.6.2.07uch
We present the system we entered in the term recognition competition, one of the subtasks organized by the NTCIR tmrec group, and evaluate its term recognition results. We regard terms as lexical items characteristic of a field, with the following three features: (1) they appear frequently in documents of the target field; (2) they are not common words in the target field; and (3) they appear less frequently in the corpora of other fields. Our system draws on corpora from different fields and uses these three features to recognize terms.
We then analyze the differences between our term list and the manual candidates list produced by the NTCIR tmrec group. In this article we identify features that are important for automatic term recognition. Furthermore, through comparative experiments based on manual candidates, we establish the importance of indices in extracting a term list.
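The three features listed in the abstract can be illustrated with a small sketch. This is not the authors' scoring method, which the abstract does not specify. It is a minimal illustration that assumes documents are already tokenized, that "frequent in the target field" means a document frequency above a threshold, that common words are supplied as a stop list, and that the score is the difference between target and other-field document frequencies. The names `document_frequency`, `rank_candidates`, `common_words`, and `min_target_df` are hypothetical.

```python
from collections import Counter


def document_frequency(docs):
    """Return, for each token, the fraction of documents containing it."""
    if not docs:
        return {}
    counts = Counter()
    for doc in docs:
        counts.update(set(doc))  # count each token once per document
    n = len(docs)
    return {tok: c / n for tok, c in counts.items()}


def rank_candidates(target_docs, other_docs, common_words=frozenset(),
                    min_target_df=0.5):
    """Rank tokens by an illustrative score (an assumption, not the paper's)."""
    target_df = document_frequency(target_docs)
    other_df = document_frequency(other_docs)
    scored = []
    for tok, df in target_df.items():
        if df < min_target_df:   # feature (1): frequent in the target field
            continue
        if tok in common_words:  # feature (2): not a common word
            continue
        # feature (3): less frequent in corpora of other fields
        score = df - other_df.get(tok, 0.0)
        if score > 0:
            scored.append((tok, score))
    # highest score first, ties broken alphabetically
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Toy corpora, invented purely for illustration.
target = [
    ["the", "parser", "uses", "morpheme", "lattice"],
    ["the", "morpheme", "lattice", "is", "pruned"],
    ["the", "parser", "is", "fast"],
]
other = [
    ["the", "market", "is", "volatile"],
    ["the", "parser", "of", "prices"],
]
ranked = rank_candidates(target, other, common_words={"the", "is"})
print([tok for tok, _ in ranked])  # ['lattice', 'morpheme', 'parser']
```

In this toy data, "parser" ranks below "morpheme" and "lattice" because it also occurs in the other-field corpus, which is the effect feature (3) is meant to capture.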
Keywords: field, term recognition, corpora, document frequency, term frequency
