In: Norms and Conventions in the History of English
Edited by Birte Bös and Claudia Claridge
[Current Issues in Linguistic Theory 347] 2019
pp. 149–166
Testing a stylometric tool in the study of Middle English documentary texts
Published online: 27 May 2019
https://doi.org/10.1075/cilt.347.08mak
Abstract
This article tests Stylo, a stylometric script for R, in the study of Middle English documents. The main aim is to discriminate between Middle English documents and document groups with the help of an automatic classification script. The basic assumption is that stylometric tools, and Stylo in particular, may provide new insights into affinities between Middle English texts. In other words, stylometric tools in general help to identify groups of texts whose members share something in their parentage, and Stylo for R is no exception.
Article outline
- 1. Introduction
- 2. Theoretical background
- 2.1 N-grams in the analysis of Middle English
- 3. Material: A Corpus of Middle English Local Documents
- 4. Method
- 4.1 Stylo and Middle English
- 5. Analysis
- 5.1 One genre: Letters in MELD
- 5.2 Two genres: Letters and wills
- 5.3 Several genres simultaneously
- 6. Discussion
- 7. Conclusion
