Article published in: Register and Professional Discourse
Edited by Shelley Staples and Gavin Brookes
[Register Studies 7:1] 2025
pp. 11–41
A register approach to specialized word list creation
Using keywords to supplement the Contracts Word List
Published online: 2 May 2025
https://doi.org/10.1075/rs.25004.lar
Abstract
Specialized word lists (SWLs) can help language learners acquire domain-specific vocabulary; however, few such lists exist for
legal domains despite the growing demand for resources in this area. Moreover, register is rarely treated as a meaningful
component of list design or validation, even though it is one of the most important predictors of linguistic variation,
including lexical variation. The present study begins to fill these gaps by expanding on the Contracts Word List (Hanks et al.,
2024) to create subregister-specific lists for 54 types of contracts (CWL+). SWLs within the Contracts Word List were generated
primarily through text-dispersion keyness analysis (Egbert & Biber, 2019), and their usefulness was validated through percent
coverage statistics and register analysis. This study illustrates the usefulness of keyness analysis in word list creation and
the constructive role that register can play in the design and evaluation of SWLs.
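The two quantitative steps named in the abstract, text-dispersion keyness and percent coverage, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy corpora, and the 3.84 cutoff (the conventional log-likelihood critical value at p < .05) are assumptions made for illustration. The sketch follows the general idea in Egbert and Biber (2019) of computing keyness from the number of texts that contain a word rather than from raw token frequency.

```python
import math
from collections import Counter


def text_dispersion_keyness(target_texts, reference_texts):
    """Log-likelihood keyness computed over text counts, not token counts.

    Each corpus is a list of texts; each text is a list of word tokens.
    Returns {word: signed LL score}; a positive score means the word occurs
    in proportionally more target texts than reference texts.
    """
    n_t, n_r = len(target_texts), len(reference_texts)
    # For every word, count how many texts contain it at least once.
    range_t = Counter(w for text in target_texts for w in set(text))
    range_r = Counter(w for text in reference_texts for w in set(text))

    scores = {}
    for word, a in range_t.items():
        b = range_r.get(word, 0)
        # Expected text counts if the word were equally dispersed in both corpora.
        e_t = n_t * (a + b) / (n_t + n_r)
        e_r = n_r * (a + b) / (n_t + n_r)
        ll = 2 * (a * math.log(a / e_t) + (b * math.log(b / e_r) if b else 0.0))
        scores[word] = ll if a / n_t >= b / n_r else -ll
    return scores


def percent_coverage(word_list, texts):
    """Percentage of running tokens in `texts` that appear on `word_list`."""
    listed = set(word_list)
    tokens = [w for text in texts for w in text]
    if not tokens:
        return 0.0
    return 100 * sum(w in listed for w in tokens) / len(tokens)


# Hypothetical toy data: three "lease" texts against three other contract texts.
leases = [
    ["tenant", "shall", "pay", "rent"],
    ["landlord", "and", "tenant", "agree"],
    ["tenant", "shall", "vacate"],
]
other = [
    ["employee", "shall", "receive", "salary"],
    ["buyer", "and", "seller", "agree"],
    ["licensee", "shall", "pay", "fee"],
]

scores = text_dispersion_keyness(leases, other)
keywords = [w for w, s in scores.items() if s > 3.84]  # assumed cutoff
coverage = percent_coverage(keywords, leases)
```

In this toy case only "tenant" clears the cutoff, because it appears in every lease text and in no reference text, while "shall" scores zero because it is equally dispersed in both corpora. A real application would use lemmatized texts and a much larger reference corpus.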
Article outline
- 1. Introduction
- 2. Literature review
- 2.1 Contracts
- 2.2 Word lists
- 2.3 Register and the contracts word list
- 2.4 Keyness analysis
- 3. The present study
- 4. Methods
- 4.1 The corpus
- 4.2 Word list generation
- 4.3 Word list validation
- 5. Results
- 5.1 Word lists
- 5.2 Validation
- 5.3 Situational analysis
- 6. Discussion
- 6.1 Percent coverage
- 6.2 Situational analysis
- 6.2.1 Variation in word lists across subregisters
- 6.2.2 Patterns in word lists across subregisters
- 7. Conclusion
- 7.1 Implications
- 7.2 Limitations and future directions
- Notes
- References
References (72)
Adler, M. (2012). The plain language movement. In L. M. Solan & P. M. Tiersma (Eds.), The Oxford handbook of language and law (pp. 67–83). Oxford Academic.
Anesa, P. (2007). Vagueness and precision in contracts: A close relationship. Linguistica e Filologia, 24, 7–38.
Anesa, P. (2019). Towards a conceptualization of legal English as a lingua franca. International Journal of English Linguistics, 9(6), 14–21.
Arndt, R. (2022). A specialized vocabulary list from an original corpus of digital science resources for middle school learners. Journal of English for Academic Purposes, 60, 101187.
Balogh, D. (2019). The role of genres and text selection in legal translator training. Studies in Logic, Grammar, and Rhetoric, 58(1), 17–34.
Biber, D. (2012). Register as a predictor of linguistic variation. Corpus Linguistics and Linguistic Theory, 8(1), 9–37.
Biber, D., & Egbert, J. (2023). What is register?: Accounting for linguistic and situational variation within and outside of text varieties. Register Studies, 5(1), 1–22.
Breeze, R. (2015). Teaching the vocabulary of legal documents: A corpus-driven approach. ESP Today, 3(1), 134–155.
Brezina, V., & Gablasova, D. (2017). How to produce vocabulary lists? Issues of definition, selection and pedagogical aims: A response to Gabriele Stein. Applied Linguistics, 38(5), 764–767.
Brysbaert, M., New, B., & Keuleers, E. (2012). Adding part-of-speech information to the SUBTLEX-US word frequencies. Behavior Research Methods, 44, 991–997.
Cao, M., & Zhao, W. (2019). A corpus-based analysis of business contract English registers. International Journal of English Linguistics, 9(3), 158–163.
Chandler, D., & Hashimoto, B. (2024). Here-, there-, and every where-: Exploring the role of pronominal adverbs in legal language. Applied Corpus Linguistics, 4(1).
Coxhead, A., & Demecheleer, M. (2018). Investigating the technical vocabulary of plumbing. English for Specific Purposes, 51, 84–97.
Davies, E. (2004). Register distinctions and measures of complexity in the language of legal contracts. In J. Gibbons, V. Prakasam, K. V. Tirumalesh, & H. Nagarajan (Eds.), Language in the Law (pp. 82–99). New Delhi: Orient Longman.
Egbert, J., & Biber, D. (2019). Incorporating text dispersion into keyword analyses. Corpora, 14(1), 77–104.
Egbert, J., Biber, D., & Gray, B. (2022). Designing and evaluating language corpora: A practical framework for corpus representativeness. Cambridge: Cambridge University Press.
Ferreira, F. (2003). The misinterpretation of noncanonical sentences. Cognitive Psychology, 47(2), 164–203.
Gardner, D., & Davies, M. (2014). A new academic vocabulary list. Applied Linguistics, 35(3), 305–327.
Geng, Q. (2018). Cultural frame and translation of pronominal adverbs in legal English. International Journal of Society, Culture & Language, 6(2), 113–124. [URL]
Gilquin, G., & Granger, S. (2010). How can data-driven learning be used in language teaching? In A. O’Keeffe & M. McCarthy (Eds.), The Routledge handbook of corpus linguistics (pp. 359–370). London: Routledge.
Grabowski, Ł. (2015). Keywords and lexical bundles within English pharmaceutical discourse: A corpus-driven description. English for Specific Purposes, 38, 23–33.
Green, C., & Lambert, J. (2018). Advancing disciplinary literacy through English for academic purposes: Collocations and word families for eight secondary subjects. Journal of English for Academic Purposes, 35, 105–115.
Greene, J. W., & Coxhead, A. J. (2015). Academic vocabulary for middle school students: Research-based lists and strategies for key content areas. Newburyport: Brookes Publishing.
Goźdź-Roszkowski, S. (2011). Patterns of linguistic variation in American legal English. Lausanne: Peter Lang.
Goulart, L., Gray, B., Staples, S., Black, A., Shelton, A., Biber, D., Egbert, J., & Wizner, S. (2020). Linguistic perspectives on register. Annual Review of Linguistics, 6, 435–455.
Hanks, E., Hashimoto, B., & Egbert, J. (2024). The contract word list: Integral vocabulary for reading and writing English contracts. English for Specific Purposes, 75, 37–48.
Hart, O., & Holmström, B. (1987). The theory of contracts. In T. F. Bewley (Ed.), Advances in economic theory (pp. 294–351). Cambridge: Cambridge University Press.
Hashimoto, B. J. (2021). Is frequency enough?: The frequency model in vocabulary size testing. Language Assessment Quarterly, 18(2), 171–187.
Hashimoto, B., & Egbert, J. (2019). More than frequency? Exploring predictors of word difficulty for second language learners. Language Learning, 69(4), 839–872.
Hessler, J. B., Brieber, D., Egle, J., Mandler, G., & Jahn, T. (2019). Applying psycholinguistic evidence to the construction of a new test of verbal memory in late-life cognitive decline: The Auditory Wordlist Learning Test. Assessment, 26(4), 743–755.
Hill, C. A. (2001). Why contracts are written in legalese. Chicago-Kent Law Review, 77(59), 59–86. [URL]
Kamrotov, M., Talalakina, E., & Stukal, D. (2022). Technical vocabulary in languages for special purposes: The corpus-based Russian economics word list. Lingua, 273, 103326.
Kankaanranta, A., & Planken, B. (2010). BELF competence as business knowledge of internationally operating business professionals. The Journal of Business Communication, 47(4), 380–407.
Kilgarriff, A. (1997). Using word frequency lists to measure corpus homogeneity and similarity between corpora. Fifth Workshop on Very Large Corpora. Presented at the VLC 1997. Retrieved from [URL]
Kurzon, D. (1997). ‘Legal language’: Varieties, genres, registers, discourses. International Journal of Applied Linguistics, 7(2), 119–139.
Kyle, K., Crossley, S., & Berger, C. (2018). The tool for the automatic analysis of lexical sophistication (TAALES): Version 2.0. Behavior Research Methods, 50, 1030–1046.
Laso, N. J., & Salazar, D. (2013). Collocations, lexical bundles and SciE-Lex: A review of corpus research on multiword units of meaning. In I. Verdaguer, N. J. Laso, & D. Salazar (Eds.), Biomedical English: A corpus-based approach (pp. 1–20). Amsterdam: John Benjamins.
Lei, L., & Liu, D. (2016). A new medical academic word list: A corpus-based study with enhanced methodology. Journal of English for Academic Purposes, 22, 42–53.
McCrobie, D. (1986). An analysis of acronyms in written text. Proceedings of the Human Factors Society Annual Meeting, 30(9), 936–940.
Martinez, E., Mollica, F., & Gibson, E. (2022). Poor writing, not specialized concepts, drives processing difficulty in legal language. Cognition, 224, 1–7.
Mattila, H. E. S. (2016). Comparative legal linguistics: Language of law, Latin and modern lingua francas (2nd ed.). London: Routledge.
Marotta-Wurgler, F. (2007). What’s in a standard form contract? An empirical analysis of software license agreements. Journal of Empirical Legal Studies, 4(4), 677–713.
Mudraya, O. (2006). Engineering English: A lexical frequency instructional model. English for Specific Purposes, 25(2), 235–256.
Mroczyńska, K. (2020). A dictionary of legal English collocations as an aid for mastering the legal English genre. Linguistics Beyond and Within (LingBaW), 6, 130–141.
Nation, I. S. P. (2016). Making and using word lists for language learning and testing. John Benjamins Publishing Company.
Paquot, M. (2007). Towards a productively-oriented academic word list. In J. Walinski, K. Kredens, & S. Gozdz-Roszkowski (Eds.), Practical applications in language and computers (pp. 127–140). Lausanne: Peter Lang.
Pojanapunya, P., & Watson Todd, R. (2018). Log-likelihood and odds ratio: Keyness statistics for different purposes of keyword analysis. Corpus Linguistics and Linguistic Theory, 14(1), 133–167.
Qi, P., Zhang, Y., Zhang, Y., Bolton, J., & Manning, C. D. (2020). Stanza: A Python natural language processing toolkit for many human languages. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 101–108. Online: Association for Computational Linguistics.
Radović, T., & Manzey, D. (2019). The impact of a mnemonic acronym on learning and performing a procedural task and its resilience toward interruptions. Frontiers in Psychology, 10, 2522.
Rayson, P. (2003). Matrix: A statistical method and software tool for linguistic analysis through corpus comparison [Unpublished doctoral thesis]. Lancaster University.
Rousseau, D. M., & Parks, J. M. (1993). The contracts of individuals and organizations. Research in Organizational Behavior, 15, 1–43.
Roxenhall, T., & Ghauri, P. (2004). Use of the written contract in long-lasting business relationships. Industrial Marketing Management, 33(3), 261–268.
Römer, U. (2009). The inseparability of lexis and grammar: Corpus linguistic perspectives. Annual Review of Cognitive Linguistics, 7(1), 140–162.
Sanden, G. R., & Kankaanranta, A. (2018). “English is an unwritten rule here”: Non-formalised language policies in multinational corporations. Corporate Communications: An International Journal, 23(4), 544–566.
Scott, M., & Tribble, C. (2006). Textual patterns: Key words and corpus analysis in language education (Vol. 22). Philadelphia: John Benjamins Publishing.
Shi, P. (2022). On the linguistic features of business contracts. Presented at the 2022 8th International Conference on Humanities and Social Science Research (ICHSSR 2022), Chongqing, China.
Sönning, L. (2024). Evaluation of keyness metrics: Performance and reliability. Corpus Linguistics and Linguistic Theory, 20(2), 263–288.
Tenedero, P. P. (2015). Linguistic analysis of trading agreements: Insights for plain writing in Philippine contracts. Philippine Journal of Linguistics, 46, 1–13.
Townley, A. (2021). The use of discourse maps to teach contract negotiation communicative practices. Business and Professional Communication Quarterly, 84(1), 5–30.
Wang, J., Liang, S. L., & Ge, G. C. (2008). Establishment of a medical academic word list. English for Specific Purposes, 27(4), 442–458.
Ward, J. (2009). A basic engineering English word list for less proficient foundation engineering undergraduates. English for Specific Purposes, 28(3), 170–182.
Watson Todd, R. (2017). An opaque engineering word list: Which words should a teacher focus on? English for Specific Purposes, 45, 31–39.
Webb, S. A., & Sasao, Y. (2013). New directions in vocabulary testing. RELC Journal, 44(3), 263–277.
Witman, P. (2005). The art and science of non-disclosure agreements. Communications of the Association for Information Systems, 16(11), 260–269.
