Article published in: Explorations of morphological structure in distributional space
Edited by Melanie J. Bell, Juhani Järvikivi and Vito Pirrelli
[The Mental Lexicon 17:3] 2022
► pp. 422–457
An inquiry into the semantic transparency and productivity of German particle verbs and derivational affixation
Available under the Creative Commons Attribution (CC BY) 4.0 license.
For any use beyond this license, please contact the publisher at rights@benjamins.nl.
Published online: 23 March 2023
https://doi.org/10.1075/ml.22012.stu
Abstract
This study addresses the relation between morphological productivity and semantic transparency. Using distributional semantics, we compare German word formation using particles with derivational word formation. We observed that derivational suffixes, but not particles, tend to make strong independent semantic contributions to their carrier words. In two-dimensional t-SNE maps, complex words show clustering by affix, but not by particle. Furthermore, the semantic vectors of suffixed words are predictable from their base words with higher accuracy than is possible for particle verbs. For particle verbs, but not affixed verbs, semantic similarity within the set of complex words correlated negatively with the number of types. Moreover, only for particle verbs did a greater number of observed types predict a reduced probability of observing unseen types. We propose that particle verbs primarily serve the onomasiological function of labeling, resulting in relatively idiosyncratic semantic vectors. By contrast, words sharing derivational affixes form distinct clusters in semantic space while maintaining strong and consistent semantic relations with their base words. This allows these words not only to serve as labels but also to be used with an anaphoric function in discourse.
Keywords: semantic transparency, productivity, embeddings, particle verbs, suffixation, German
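The two quantities the abstract relates, the probability of observing unseen types and the semantic similarity within a set of complex words, can be illustrated with a minimal sketch. It rests on assumptions not stated on this page: the probability of unseen types is taken to be the Good-Turing estimate (types seen exactly once divided by the token count), and within-set similarity is taken to be the mean pairwise cosine similarity of the word vectors. The function names and the toy data are hypothetical and are not drawn from the study.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def unseen_type_probability(tokens):
    """Good-Turing estimate of the chance that the next token is a new type:
    the number of types seen exactly once divided by the number of tokens.
    (Assumed estimator; the abstract does not name one.)"""
    if not tokens:
        raise ValueError("need at least one token")
    counts = Counter(tokens)
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / len(tokens)


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm


def mean_pairwise_similarity(vectors):
    """Average cosine similarity over all unordered pairs of vectors,
    used here as a stand-in for within-category semantic similarity."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy data, invented for illustration only.
toy_tokens = ["aufmachen", "aufmachen", "aufstehen",
              "aufessen", "aufgeben", "aufgeben"]
toy_vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

p_unseen = unseen_type_probability(toy_tokens)      # 2 hapaxes out of 6 tokens
similarity = mean_pairwise_similarity(toy_vectors)  # one identical pair, two orthogonal pairs
print(p_unseen, similarity)
```

Computing both values per particle or affix and correlating them with the number of types would be one way to examine the relationship described in the abstract. The article's actual procedure may differ.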
Article outline
- 1. Introduction
- 2. Productivity
- 3. Word embeddings, particle verbs and affixed words
- 3.1 Semantic transparency assessed with constituent vectors
- 3.2 Assessing transparency with averaged within-category correlations
- 4. Geometry of semantic transparency in semantic space
- 4.1 t-SNE analysis of semantic space
- 4.2 t-SNE analysis of shift vectors
- 5. Assessing transparency with functions for conceptualization
- 6. Discussion
- Acknowledgements
- Notes
- References
