Chapter 14. Terminology and distributional analysis of corpora

Bertels, Ann

doi:10.1075/tlrp.23.14ber

In: Theoretical Perspectives on Terminology: Explaining terms, concepts and specialized knowledge
Edited by Pamela Faber and Marie-Claude L'Homme
[Terminology and Lexicography Research and Practice 23] 2022
pp. 311–328

Ann Bertels | University of Leuven

Published online: 14 June 2022

https://doi.org/10.1075/tlrp.23.14ber

Abstract

This chapter discusses the theoretical and methodological principles of distributional semantic analysis. Over the last decade, Distributional Semantics has become very popular in Corpus Linguistics, building on very large corpora and extracting useful semantic information for numerous applications. In our “big data” era, Artificial Intelligence (AI) is carving its way into our daily life. AI’s algorithms in Natural Language Processing (NLP) learn from text collections with easily accessible information in order to find and predict knowledge patterns. This chapter explores the use of distributional analysis for terminological needs, i.e., in specialized domains. It focuses on what distributional analysis stands for, how it works, how it can be used for LSP and Terminology, and why it is useful for terminological needs.

Keywords: Distributional semantics, vector space models, specialized corpora, semantic similarity, semantic relatedness, co-occurrence analysis

Article outline

1. Introduction
2. Distributional Semantics
- 2.1 Theoretical and methodological principles
- 2.2 Overview of Distributional Semantic Models (DSMs)
  - 2.2.1 Differences with respect to the linguistic context
  - 2.2.2 Differences with respect to the level of analysis
  - 2.2.3 Differences with respect to the distributional representation
3. Distributional analysis of specialized corpora
- 3.1 Comparative studies applied to specialized corpora
- 3.2 Terminology extraction
- 3.3 Ontology building and taxonomy extraction
- 3.4 Semantic relations
- 3.5 Knowledge patterns and semantic frames
- 3.6 Terminological variation
4. Challenges of distributional analysis and issues raised by specialized corpora
- 4.1 Corpus size and data sparseness
- 4.2 Multi-word terms and compositionality
- 4.3 Specialized vocabulary and in-domain knowledge
5. Conclusions
Notes

Cited by one other publication

San Martín, Antonio & Catherine Trekker. 2025. Discovering hyponymic knowledge patterns in English. Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication

This list is based on CrossRef data as of 6 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
