Article published in: Terminology
Vol. 22:1 (2016), pp. 80–102
Measuring the degree of specialisation of sub-technical legal terms through corpus comparison
A domain-independent method
Published online: 19 May 2016
https://doi.org/10.1075/term.22.1.04mar
One of the most remarkable features of the legal English lexicon is the use of sub-technical vocabulary, that is, words frequently shared by the general and specialised fields which either retain a legal meaning in general English or acquire a specialised one in the legal context. As testing has shown, almost 50% of the terms extracted from BLaRC, an 8.85-million-word legal corpus, were found amongst the most frequent 2,000 word families of West’s (1953) GSL, Coxhead’s (2000) AWL or the BNC (2007), hence the relevance of this type of vocabulary in this English variety. Owing to their peculiar statistical behaviour in both contexts, it is particularly problematic to identify these words and measure their termhood based on such parameters as their frequency or distribution in the general and specialised environments. This research proposes a novel termhood measuring method intended to objectively quantify this lexical phenomenon through the application of Williams’ (2001) lexical network model, which incorporates contextual information to compute the level of specialisation of sub-technical terms.
Keywords: Legal English, sub-technical terms, lexical networks, ESP, corpus linguistics
References (54)
Ahmad, Khurshid, Andrea Davies, Heather Fulford, and Monika Rogers. 1994. “What is a Term? The Semi-automatic Extraction of Terms from Text.” In Translation Studies: An Interdiscipline, ed. by M. Snell-Hornby, F. Pöchhacker, and K. Kaindl, 267–278. Amsterdam: John Benjamins.
Ananiadou, Sofia. 1988. A Methodology for Automatic Term Recognition. PhD Thesis, University of Manchester, Institute of Science and Technology, United Kingdom.
Aronson, Alan, and Françoise-Michel Lang. 2010. “An Overview of MetaMap: Historical Perspective and Recent Advances.” Journal of American Medical Informatics Association 17 (3): 229–236.
Baker, Mona. 1988. “Sub-technical Vocabulary and the ESP Teacher: An Analysis of some Rhetorical Items in Medical Journal Articles.” Reading in a Foreign Language 4 (2): 91–105.
Barrón-Cedeño, Alberto, Gerardo Sierra, Patrick Drouin, and Sofia Ananiadou. 2009. “An Improved Automatic Term Recognition Method for Spanish.” In Proceedings of the 10th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2009), ed. by A. Gelbukh, 125–136. Berlin: Springer-Verlag. ([URL]). Accessed January 2016.
Bourigault, Didier. 1992. “Surface Grammatical Analysis for the Extraction of Terminological Noun Phrases.” In Proceedings of the 5th International Conference on Computational Linguistics, 977–981. Nantes, France.
Cabré, María Teresa, Rosa Estopà, and Jorge Vivaldi. 2001. “Automatic Term Detection: A Review of Current Systems.” In Recent Advances in Computational Terminology, ed. by D. Bourigault, C. Jacquemin, and M.C. L’Homme, 53–87. Amsterdam: John Benjamins.
Chung, Teresa M. 2003. “A Corpus Comparison Approach for Terminology Extraction.” Terminology 9 (2): 221–246.
Chung, Teresa M., and Paul Nation. 2003. “Technical Vocabulary in Specialised Texts.” Reading in a Foreign Language 15 (2): 103–116.
Church, Kenneth W., and Patrick Hanks. 1990. “Word Association Norms, Mutual Information, and Lexicography.” Computational Linguistics 16 (1): 22–29.
Church, Kenneth W., and William Gale. 1995. “Inverse Document Frequency IDF: A Measure of Deviations from Poisson.” In Proceedings of the Third Workshop on Very Large Corpora, ed. by D. Yarowsky and K. Church, 121–130. Cambridge: Massachusetts Institute of Technology Press.
Cowan, Ronayne. 1974. “Lexical and Syntactic Research for the Design of EFL.” TESOL Quarterly 81: 389–399.
Dagan, Ido, and Kenneth Church. 1994. “TERMIGHT: Identifying and Translating Technical Terminology.” In Proceedings of the 4th Conference on Applied Natural Language Processing, 34–40. Stuttgart, Germany. ([URL]). Accessed January 2016.
Daille, Beatrice. 1996. “Study and Implementation of Combined Techniques for Automatic Extraction of Terminology.” In The Balancing Act: Combining Symbolic and Statistical Approaches to Language, ed. by J.L. Klavans and P. Resnik, 29–36. Cambridge: Massachusetts Institute of Technology Press.
David, Sophie, and Pierre Plante. 1990. Termino 1.0. Research Report of Centre d’Analyse de Textes par Ordinateur. Université du Québec, Montréal.
Drouin, Patrick. 2003. “Term Extraction Using Non-technical Corpora as a Point of Leverage.” Terminology 9 (1): 99–117.
Dunning, Ted. 1993. “Accurate Methods for the Statistics of Surprise and Coincidence.” Computational Linguistics 19 (1): 61–74.
Fahmi, Ismail, Gosse Bouma, and Lonneke van der Plas. 2007. “Improving Statistical Method Using Known Terms for Automatic Term Extraction.” In Proceedings of Computational Linguistics in the Netherlands (CLIN 17), ed. by F. van Eynde, P. Dirix, I. Schuurman, and V. Vandeghinste, 1–8. Belgium: University of Leuven.
Farrell, Paul. 1990. Vocabulary in ESL: A Lexical Analysis of the English of Electronics and a Study of Semi-technical Vocabulary. Dublin: Centre for Language and Communication Studies.
Flowerdew, John. 2001. “Concordancing as Tool in Course Design.” In Small corpus Studies and ELT: Theory and Practice, ed. by M. Ghadessy, A. Henry, and R. Roseberry, 71–92. Amsterdam: John Benjamins.
Frantzi, Katerina T., and Sophia Ananiadou. 1999. “The C/NC Value Domain Independent Method for Multi-word Term Extraction.” Journal of Natural Language Processing 3 (2): 115–127.
Frantzi, Katerina, Sofia Ananiadou, and Hideki Mima. 2000. “Automatic Recognition of Multi-Word Terms: The C-value/NC-value Method.” International Journal on Digital Libraries 3 (2): 115–130.
Geffet, Maayan, and Ido Dagan. 2005. “The Distributional Inclusion Hypotheses and Lexical Entailment.” In Proceedings of the Annual Meeting of the ACL, 107–114. Michigan, USA.
Heatley, Alex, and Paul Nation. 2002. Range. Computer software. Wellington, New Zealand: Victoria University of Wellington.
Jacquemin, Christian. 2001. Spotting and Discovering Terms through NLP. Cambridge: Massachusetts Institute of Technology Press.
Joslyn, Cliff, Patrick Paulson, and Karin Verspoor. 2008. “Exploiting Term Relations for Semantic Hierarchy Construction.” In Proceedings of the International Conference of Semantic Computing IEEE, 42–49. Santa Clara (CA), USA.
Justeson, John S., and Slava M. Katz. 1995. “Technical Terminology: Some Linguistic Properties and an Algorithm for Identification in Text.” Natural Language Engineering 1 (1): 9–27.
Kit, Chunyu, and Xiaoyue Liu. 2008. “Measuring Mono-word Termhood by Rank Difference via Corpus Comparison.” Terminology 14 (2): 204–229.
Lemay, Chantal, Marie-Claude L’Homme, and Patrick Drouin. 2005. “Two Methods for Extracting ‘Specific’ Single-Word Terms from Specialised Corpora.” International Journal of Corpus Linguistics 10 (2): 227–255.
Loginova, Elizabeta, Anita Gojun, Helena Blancafort, María Guegan, Tatiana Gornostay, and Ulrich Heid. “Reference Lists for the Evaluation of Term Extraction Tools.” In Proceedings of TKE 2012: Terminology and Knowledge Engineering, 177–192. Madrid: Universidad Politécnica de Madrid. ([URL]). Accessed January 2016.
Marín, María José. 2014. “Evaluation of Five Single-word Term Recognition Methods on a Legal Corpus.” Corpora 9 (1): 83–107.
Marín, María José, and Camino Rea. 2012. “Structure and Design of the BLRC: A Legal Corpus of Judicial Decisions from the UK.” Journal of English Studies 101: 131–145.
Maynard, Diana, and Sofia Ananiadou. 2000. “TRUCKS: A Model for Automatic Multi-word Term Recognition.” Journal of Natural Language Processing 8 (1): 101–125.
Nakagawa, Hiroshi, and Tatsunori Mori. 2002. “A Simple but Powerful Automatic Term Extraction Method.” In COLING-02 on COMPUTERM: Proceedings of the Second International Workshop on Computational Terminology, 1–7. Taipei, Taiwan.
Nazar, Rogelio, and María Teresa Cabré. 2012. “Supervised Learning Algorithms Applied to Terminology Extraction.” In Proceedings of the 10th Terminology and Knowledge Engineering Conference TKE 2012, ed. by G. Aguado de Cea, M.C. Suárez-Figueroa, R. García-Castro, and E. Montiel-Ponsoda, 209–217. Madrid: Ontology Engineering Group, Association for Terminology and Knowledge Transfer.
Orts, María Ángeles. 2006. Aproximación al Discurso Jurídico en Inglés: Las Pólizas de Seguro Marítimo de Lloyd’s. Madrid: Edisofer.
Pazienza, Maria Teresa, Marco Pennacchiotti, and Fabio Massimo Zanzotto. 2005. “Terminology Extraction: An Analysis of Linguistic and Statistical Approaches.” Studies in Fuzziness and Soft Computing 1851: 225–279.
Park, Younja, Roy Byrd, and Branimir Boguraev. 2002. “Automatic Glossary Extraction: Beyond Terminology Association.” In Proceedings of COLING’02 19th International Conference on Computational Linguistics, ed. by S.C. Zeng, 1–7. Taipei, Taiwan.
Sclano, Francesco, and Paola Velardi. 2007. “A Web Application to Learn the Common Terminology of Interest Groups and Research Communities.” In Proceedings of the Conference TIA-2007, ed. by C. Engehard and R.D. Kuntz, 85–94. Grenoble: Presses Universitaires de Grenoble.
Sparck-Jones, Kathleen. 1972. “A Statistical Interpretation of Term Specificity and its Application in Retrieval.” Journal of Documentation 281: 11–21.
Trimble, Louis. 1985. English for Science & Technology: A Discourse Approach. Cambridge: Cambridge University Press.
Vivaldi, Jorge. 2001. Extracción de Candidatos a Término mediante Combinación de Estrategias Heterogéneas. PhD Thesis. Universidad Politécnica de Cataluña.
Vivaldi, Jorge, Diego Cabrera, Luis Adrián, Gerardo Sierra, and María Pozzi. 2012. “Using Wikipedia to Validate the Terminology Found in a Corpus of Basic Textbooks.” In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12), 3820–3827. Istanbul: Istanbul Lütfi Kırdar Convention and Exhibition Centre. ([URL]). Accessed January 2016.
Wang, Karen, and Paul Nation. 2004. “Word Meaning in Academic English: Homography in the Academic Word List.” Applied Linguistics 25 (3): 291–314.
Weeds, Julie, David Weir, and Diana McCarthy. 2004. “Characterising Measures of Lexical Distributional Similarity.” In Proceedings of Coling-04, 1–7. Geneva, Switzerland.
Cited by (3)
Marín, María José. 2023. Automatic term recognition and legal language. In Handbook of Terminology [Handbook of Terminology, 3], pp. 511 ff.
Pérez, María José Marín & Ángela Almela
This list is based on CrossRef data as of 6 December 2025. Please note that it may not be complete. Sources presented here have been supplied by the respective publishers. Any errors therein should be reported to them.
