Article published in: International Journal of Corpus Linguistics
Vol. 8:1 (2003), pp. 109–127
Automatic extraction of meaningful units from corpora
A corpus-driven approach using the word stroke
Published online: 14 August 2003
https://doi.org/10.1075/ijcl.8.1.06dan
In this article, we will reconsider the notion of a word as the basic unit of analysis in language and propose that, in an information- and meaning-carrying system, the unit of analysis should be a unit of meaning (UM). Such a UM may consist of one or more words. A method will be promoted that attempts to automatically retrieve UMs from corpora. To illustrate the results that may be obtained by this method, the node word ‘stroke’ will be used in a small study. The results will be discussed, with implications considered for both monolingual and multilingual use. The monolingual study will benefit from using the British National Corpus, while the multilingual study introduces a parallel corpus consisting of Swedish novels and their translations into English.
