Article published in: English World-Wide
Vol. 36:1 (2015), pp. 1–28
Expanding horizons in the study of World Englishes with the 1.9 billion word Global Web-based English Corpus (GloWbE)
Published online: 10 February 2015
https://doi.org/10.1075/eww.36.1.01dav
In this paper, we provide an overview of the new GloWbE Corpus, the Corpus of Global Web-based English. GloWbE is based on 1.9 billion words in 1.8 million web pages from 20 different English-speaking countries. Approximately 60 percent of the corpus comes from informal blogs, and the rest from a wide range of other genres and text types. Because of its large size, architecture, and interface, the corpus can be used to examine many types of variation among dialects that might not be accessible with other corpora, including variation in lexis, morphology, (medium- and low-frequency) syntactic constructions, and meaning, as well as discourse and its relationship to culture.
