In: Term Variation in Specialised Corpora: Characterisation, automatic discovery and applications
Béatrice Daille
[Terminology and Lexicography Research and Practice 19] 2017
► pp. v–x
This article is available free of charge.
Published online: 7 August 2017
https://doi.org/10.1075/tlrp.19.toc
Table of contents
Acknowledgements
xi
1 Introduction
1
1.1 Preliminary example
3
1.2 Variants and terminological analysis
4
1.3 The automatic detection of variants
6
1.4 Variants and applications
7
1.5 Typographical conventions
7
I Characterisation
2 Definitions
11
2.1 Term
11
2.2 Derivation
14
2.3 Compounding
15
2.3.1 Morphological compounds
16
2.3.2 Border between derivation and compounding
18
2.3.3 Syntagmatic compounds
19
2.3.4 Border between morphological and syntagmatic compounds
20
2.4 Borrowing
21
2.5 Term patterns
22
2.5.1 Simple term patterns
23
2.5.2 Morphological compound patterns
23
2.5.3 Syntagmatic compound patterns
24
2.5.4 Frequency of term patterns
27
2.6 Term variants
29
2.6.1 The definition of variant
29
2.6.2 Denominative variants
30
2.6.3 Conceptual variants
30
2.7 Border between terms and variants
31
3 Conceptualisation of terminological variants
33
3.1 Description of variants
34
3.1.1 Organisation of variants
34
3.1.2 Mechanisms and linguistic operations
36
3.1.3 Properties of variants
37
3.2 Denominative variants
37
3.2.1 Synonymic substitution
38
3.2.2 Simplification
40
3.2.3 Exemplification
43
3.2.4 Competing patterns
45
3.3 Conceptual variants
46
3.3.1 Expansion
46
3.3.2 Anaphorical reduction
51
3.4 Linguistic variants
52
3.4.1 Graphics and spelling
52
3.4.2 Inflection
55
3.4.3 Derivation
56
3.4.4 Fullback-compounding
56
3.4.5 Modification
57
3.4.6 Coordination, disjunction and enumeration
58
3.5 Variants of register
62
3.5.1 Variation of scientification/popularisation
62
3.5.2 Variants of position
63
3.6 Borders between categories of variants
63
3.6.1 Denominative and linguistic variants
64
3.6.2 Denominative and conceptual variants
64
3.6.3 Conceptual and linguistic variants
64
4 Semantics of conceptual variants
67
4.1 Structuring terms
67
4.1.1 Conceptual and semantic relations
68
4.1.2 Classic semantic relations
68
4.1.3 Collocation
70
4.1.4 Lexical functions
70
4.2 Fundamental relations between term and variant
73
4.2.1 Synonymy
73
4.2.2 Hierarchical relations
74
4.3 Complex relations between term and variant
75
4.3.1 Result
76
4.3.2 Plurality
77
4.3.3 Spatiality
77
4.3.4 Temporality
78
4.3.5 Quality
78
4.4 Other relations between term and variant
79
4.4.1 Predication
79
4.4.2 Instance
79
II Automatic discovery
5 Primitive exploration
83
5.1 Comparable corpora
83
5.1.1 Corpus
83
5.1.2 Properties
84
5.1.3 Collecting comparable corpora
86
5.1.4 Comparability
87
5.2 Comparable corpora used in this study
87
5.3 Looking for variants
89
5.3.1 Implementation
89
5.3.2 N-gram massive data
90
5.3.3 Unigrams
91
5.3.4 Skip-grams
99
5.3.5 Categories of variants facing data
101
5.4 Comparison according to communication levels
104
5.4.1 Unigrams
104
5.4.2 Skip-grams
105
6 Processing methods
109
6.1 Linguistic-based methods
110
6.1.1 Morphological analysis
110
6.1.2 Syntagmatic and paradigmatic analysis
111
6.1.3 Syntactic analysis
112
6.1.4 Distributional analysis
113
6.2 Algorithms on strings
114
6.2.1 Distance computed from common substrings
115
6.2.2 Edit distances
117
6.3 Statistical methods
120
6.4 Typology of variant occurrences
121
6.4.1 Isolated variant occurrences
121
6.4.2 Inter-mixed term and variant occurrences
122
6.4.3 Separated term and variant occurrences
123
6.5 Relationship between processing methods and types of occurrences
124
7 Grammar of variants
125
7.1 Specifications and properties
125
7.1.1 Expressivity of the syntagmatic rules
126
7.1.2 Core operations
129
7.1.3 Ambiguity of the syntactic analysis
130
7.2 Generic grammar of recognition of variants
132
7.2.1 Competing structures
133
7.2.2 Augmented/reduced structures
134
7.2.3 Contextual structures
135
7.2.4 Function words
135
7.2.5 Ad-hoc rules
135
7.3 Variant grammars for specific languages
136
7.4 Cross-lingual observations
138
7.4.1 Coverage
138
7.4.2 Precision
140
7.5 Summary of observations
146
8 Synonymic variants
147
8.1 Distributional analysis
147
8.1.1 Modelling of the distributional methods
148
8.1.2 Observations in specialised domains
151
8.2 Compositional method
152
8.3 Semi-compositional method
153
8.4 Cross-lingual and cross-method observations
154
8.4.1 Reference lists of synonyms
155
8.4.2 Experimental setup parameters
157
8.4.3 Evaluation measures
158
8.4.4 Results
159
8.5 Towards the detection of antonymic variants
165
III Applications and tools
9 Terminology extraction
171
9.1 The core of terminology extraction
172
9.2 Collecting candidate terms
173
9.2.1 Patterns
173
9.2.2 Generic rules
173
9.2.3 Borders
176
9.2.4 Lexical expansion
176
9.3 Filtering and sorting candidate terms
177
9.3.1 Frequency
178
9.3.2 Association measures
178
9.3.3 Specificity measures
179
9.3.4 Filtering by removing nested terms
182
9.3.5 Contextual filtering
183
9.3.6 Supervised learning methods
185
9.4 Evaluation
186
9.4.1 References
187
9.4.2 Measures
189
9.5 Comparing term extraction without and with variant recognition
190
9.6 Experimental setting
190
9.6.1 Corpora
190
9.6.2 Our integrated terminology extraction
192
9.6.3 Comparison protocol
194
9.6.4 Maximum recall
194
9.6.5 Observations with a posteriori RTL
195
9.6.6 Observations with a priori RTL
199
9.7 Summary of observations
201
10 End-user applications and tools
205
10.1 Machine-aided indexing and FASTR
205
10.2 Thematic cartography and TermWatch
206
10.3 TermSuite
208
10.3.1 Architecture
208
10.3.2 Token Regex
208
10.3.3 Compost
212
10.3.4 Variant grouping
216
10.3.5 Ranking by termhood
219
10.3.6 Performance
220
10.3.7 Release
221
IV Conclusions
11 Term variants and their discovery
225
11.1 Summary of the present study
225
11.1.1 A unified typology of term variants
225
11.1.2 A variety of methods for the discovery of variants
226
11.1.3 A terminology-resource building application
228
11.2 Remaining issues and direction for further research
229
11.2.1 Semantic analysis of variations
229
11.2.2 Distributional analysis at the morpheme level
230
11.2.3 Recognition of other variants
231
11.3 Implications for related studies
231
11.3.1 Variants and paraphrases
231
11.3.2 Variants and translation
232
Bibliography
235
A Notation
247
A.1 Examples
247
A.2 Domains
247
A.3 Corpus
247
B Multext categories
249
C Search AntConc
251
C.1 Parameters
251
C.2 Collection of n-grams
251
C.3 Results of n-grams
253
D GGRV
257
D.1 French
257
D.2 English
260
D.3 Spanish
263
D.4 German
266
D.5 Russian
268
Index
271
