Article published in: Studies in Language
Vol. 46:4 (2022) ► pp.753–792
Derivation predicting inflection
A quantitative study of the relation between derivational history and inflectional behavior in Latin
Published online: 21 January 2022
https://doi.org/10.1075/sl.21002.bon
Abstract
In this paper, we investigate the value of derivational information in predicting the inflectional behavior of
lexemes. We focus on Latin, for which large-scale data on both inflection and derivation are easily available. We train boosting
tree classifiers to predict the inflection class of verbs and nouns with and without different pieces of derivational information.
For verbs, we also model inflectional behavior in a word-based fashion, training the same type of classifier to predict wordforms
given knowledge of other wordforms of the same lexemes. We find that derivational information is indeed helpful, and document an
asymmetry between the beginning and the end of words, in that the final element in a word is highly predictive, while prefixes
prove to be uninformative. The results obtained with the word-based methodology also allow for a finer-grained description of the
behavior of different pairs of cells.
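The prefix/suffix asymmetry described above can be illustrated with a small sketch. It does not reproduce the paper's boosted tree classifiers or its dataset. Instead it uses a simpler stand-in, conditional entropy, to measure how much uncertainty about a verb's conjugation remains once its prefix or its final derivational suffix is known. The verb sample is hand-picked for illustration, and the names `VERBS`, `entropy` and `conditional_entropy` are assumptions introduced here.

```python
from collections import Counter, defaultdict
from math import log2

# Hand-picked toy sample for illustration only (NOT the paper's data).
# Each row: (lemma, prefix, final derivational suffix, conjugation).
VERBS = [
    ("clamito",    "",    "it", "1"),
    ("agito",      "",    "it", "1"),
    ("dictito",    "",    "it", "1"),
    ("calesco",    "",    "sc", "3"),
    ("senesco",    "",    "sc", "3"),
    ("consenesco", "con", "sc", "3"),
    ("reviresco",  "re",  "sc", "3"),
    ("convoco",    "con", "",   "1"),
    ("revoco",     "re",  "",   "1"),
    ("contineo",   "con", "",   "2"),
    ("retineo",    "re",  "",   "2"),
    ("conduco",    "con", "",   "3"),
    ("reduco",     "re",  "",   "3"),
    ("convenio",   "con", "",   "4"),
    ("revenio",    "re",  "",   "4"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())

def conditional_entropy(rows, feature_index):
    """Entropy of the conjugation (column 3) given the chosen feature column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature_index]].append(row[3])
    total = len(rows)
    return sum(len(g) / total * entropy(g) for g in groups.values())

h_class = entropy([row[3] for row in VERBS])
h_given_prefix = conditional_entropy(VERBS, 1)
h_given_suffix = conditional_entropy(VERBS, 2)

print(f"H(conjugation)          = {h_class:.3f} bits")
print(f"H(conjugation | prefix) = {h_given_prefix:.3f} bits")
print(f"H(conjugation | suffix) = {h_given_suffix:.3f} bits")
```

In this toy sample, knowing the suffix leaves less uncertainty about the conjugation than knowing the prefix, because verbs in -it- and -sc- each fall into a single conjugation while con- and re- attach to verbs of all four. The sample was chosen to make that contrast visible, so the numbers say nothing about the size of the effect in the full lexicon.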
Keywords: morphology, derivation, inflection, Latin
Article outline
- 1. Introduction
- 2. Data collection and annotation
- 3. Predicting inflection classes
- 3.1 Methodology
- 3.2 Predicting the conjugation of verbs
- 3.3 Predicting the declension of nouns
- 4. A word-based alternative: Predicting verb forms
- 4.1 Rationale
- 4.2 The information-theoretic approach to the PCFP
- 4.3 The PCFP as a classification problem
- 4.4 Results
- 5. Conclusions
- Acknowledgements
- Notes
- Appendix
