Article published in: Journal of Pidgin and Creole Languages: Online-First Articles
The digital diffusion of pidgin/creoles
Langwij, kolcha, media and Large Language Models
Published online: 25 September 2025
https://doi.org/10.1075/jpcl.24025.mai
Abstract
The study explores three factors in the digital spread of pidgins and creoles. The first is the writing boom on
the Internet, which has led to prestige gains for pidgin/creoles. The second is the interplay of migration, culture and the media
as boosters in the diffusion of pidgin/creoles. The third is the technologisation of pidgin/creoles, as measured by their presence
in Large Language Models (LLMs) developed for text generation and machine translation. While pidgin/creoles are typically not
among the languages that have been technologised ‘top down’ and systematically, several of them, including Jamaican Creole and
Nigerian Pidgin, have a high profile in the digital media and therefore also a significant presence in the training data of LLMs.
Where pidgin/creoles have become informal world languages in this way, new challenges arise for the development of standard
orthographies and codification.
Article outline
- 1. Introduction
- 2. The globalisation of Jamaican and Naijá
- 3. Artificial intelligence and Large Language Models
- 4. Conclusion
- Notes
- References