In: Mathematical Modelling in Linguistics and Text Analysis: Theory and applications
Edited by Adam Pawłowski, Sheila Embleton, Jan Mačutek and Aris Xanthos
[Current Issues in Linguistic Theory 370] 2025
pp. 128–137
A comparative analysis of stylometry and authorship attribution in a creole and non-creole language
Published online: 13 October 2025
https://doi.org/10.1075/cilt.370.11juo
Abstract
Stylometry, the computational study of writing style, has proven effective for answering questions of
authorship in a wide variety of languages. However, previous research has focused on non-creole languages. Our previous work
presented an analysis of 100 parliamentary speeches in Seselwa (Seychellois Creole) and showed attribution performance far
above chance. We extend this analysis with a comparable corpus of standard (Québécois) French. Neither language showed an
advantage in attribution performance. We conclude that creole languages exhibit no less complexity and stylistic variation
than non-creole languages.
Article outline
- 1. Introduction
- 2. Background
- 2.1 Stylometry
- 2.2 Creole languages
- 2.3 Seselwa
- 3. Materials and methods
- 3.1 Materials
- 3.2 Methods
- 4. Results
- 5. Discussion
- 6. Conclusions and future work
- References
References (20)
Ainsworth, Janet & Patrick Juola. 2019. Who wrote this?: Modern forensic authorship analysis as a model for valid forensic science. Washington University Law Review 96(5). 1161–1189.
Binongo, J. N. G. 2003. Who wrote the 15th book of Oz? An application of multivariate analysis to authorship attribution. CHANCE 16(2). 9–17.
Coulthard, Malcolm. 2004. Author identification, idiolect, and linguistic uniqueness. Applied Linguistics 25(4). 431–447.
DeGraff, Michel. 2005. Linguists’ most dangerous myth: The fallacy of creole exceptionalism. Language in Society 34(4). 533–591.
Jourdan, C. 1991. Pidgins and creoles: The blurring of categories. Annual Review of Anthropology 20. 187–209.
Juola, Patrick. 2008. Authorship attribution. Foundations and Trends® in Information Retrieval 25(3). 233–334.
Juola, Patrick & Alexander J. Napolitano Jawerbaum. 2022. Stylometric authorship attribution in Seychellois Creole. DH_BUDAPEST_2022, Budapest, Hungary.
Juola, Patrick, John Sofko & Patrick Brennan. 2006. A prototype for authorship attribution studies. Digital Scholarship in the Humanities 21(2). 169–178.
McMenamin, Gerald R. 2011. Declaration of Gerald R. McMenamin in support of defendants’ motion for expedited discovery, Paul D. Ceglia v. Mark Elliott Zuckerberg and Facebook, Inc., United States District Court, Western District of New York, filed June 2, 2011; Exhibit B, p. 2 ([URL]).
McWhorter, John H. 1998. Identifying the Creole prototype: Vindicating a typological class. Language 74(4). 788–818.
Mosteller, Frederick & David L. Wallace. 1963. Inference in an authorship problem. Journal of the American Statistical Association 58(302). 275–309.
Plag, Ingo. 2011. Pidgins and creoles. In Manfred Pienemann & Jörg-U. Keßler (eds.), Studying processability theory, 106–120. Amsterdam: Benjamins.
Robinson, Stuart. 2008. Why pidgin and creole linguistics needs the statistician: Vocabulary size in a Tok Pisin corpus. Journal of Pidgin and Creole Languages 23(1). 141–146.
