Article published in: International Journal of Corpus Linguistics
Vol. 28:3 (2023), pp. 344–377
Research trends in corpus linguistics
A bibliometric analysis of two decades of Scopus-indexed corpus linguistics research in arts and humanities
Published online: 14 November 2022
https://doi.org/10.1075/ijcl.21072.cro
Abstract
This paper uses a bibliometric analysis to map the field of Corpus Linguistics (CL) research in arts and
humanities over the last 20 years, tracking changes in popular CL research topics, outlets, highly cited authors, and geographical
origins based on the metadata of 5,829 CL-related articles from 429 Scopus-indexed journals. Results reveal an increase in
corpus-assisted discourse studies, lexical bundles and academic writing, alongside newer topics including multilingualism and
social media. CL studies span 193 languages/dialects, with a significant rise in Chinese, Russian, Spanish, and Italian CL research
over the past decade. The analysis identifies clusters of highly cited CL researchers spanning (inter)disciplinary research areas. A
growing number of CL researchers in China, Poland, South Korea, Japan, and elsewhere is evidence of the now global reach of CL research.
These findings mirror diachronic socio-cultural developments in applied linguistics and society more generally and provide
insights into where CL research might go next.
Article outline
- 1. Introduction
- 2. Bibliometrics and corpus linguistics
- 3. Method
- 3.1 Dataset inclusion criteria
- 3.2 Exclusion criteria
- 3.3 Evaluation of sample representativeness
- 3.4 Final dataset
- 3.5 Analytical procedures
- RQ1: Research topics and languages
- RQ2: Authors and citations
- RQ3: Journals and geographical sources
- 4. Results
- 4.1 Research topics and languages of investigation
- 4.1.1 Research topics
- 4.1.2 Languages of investigation
- 4.2 Who are the most cited authors?
- 4.3 Which journals and countries of publication are the most influential in CL research?
- 4.3.1 Countries of publication
- 4.3.2 Journals issuing CL research
- 5. Discussion and conclusion
