In: Ethical Issues in Applied Linguistics Scholarship
Edited by Peter I. De Costa, Amr Rabie-Ahmed and Carlo Cinaglia
[Research Methods in Applied Linguistics 7] 2024
pp. 28–44
Chapter 2: Corpus linguistics and ethics
Published online: 21 November 2024
https://doi.org/10.1075/rmal.7.02bro
Abstract
In this chapter, we explore the ethical considerations attending research and practice in corpus
linguistics. Despite the ubiquity of ethical dilemmas in corpus construction and use, there has been scant literature
dedicated to ethical practices within the discipline. This gap is particularly pronounced given the increasing
engagement with digital and online data sources, which pose unique ethical challenges regarding issues such as
consent, privacy, and the public-private dichotomy. The chapter addresses these ethical considerations, and more
besides, from the inter-related perspectives of research participants, corpus builders, distributors, and users.
Importantly, the chapter highlights how ethical considerations are not confined to discrete stages of corpus
linguistic projects but, rather, are interwoven throughout the research lifecycle. Key issues addressed include
informed consent, participant anonymity, the ethical implications of using publicly available versus private
communications, and the responsibilities of corpus users to ensure the meaningful, truthful, and fair representation
of their findings. The chapter aims to respond to the need for more nuanced ethical guidelines that reflect the
diversity of data sources and research contexts that characterise contemporary corpus linguistics, advocating for a
reflective, case-by-case approach to ethical decision-making.
Article outline
- Introduction
- Ethical considerations in corpus linguistics
- Research participants
- Corpus builders
- Corpus distributors
- Corpus users
- Conclusions
