Article published in: Interpreting
Vol. 21:1 (2019), pp. 62–90
A corpus for signed language interpreting research
Published online: 13 March 2019
https://doi.org/10.1075/intp.00020.weh
Abstract
Because of the visual nature of signed language, the compilation of a signed language interpreting corpus along
the lines of spoken-language interpreting corpora has been viewed as extremely challenging, if not impossible. This study offers a
unique contribution in the construction of a lemmatized, annotated text-based corpus of signed language media interpretations,
which allows analysis of interesting features using readily available concordance software. In this article, characteristics of
original (not interpreted) signed language corpora are explored in terms of metadata conventions, transcription and annotation, in
order to provide a framework for an interpreting corpus. Within this framework, the decisions and steps taken in the construction
of the interpreting corpus are discussed and explained.
Article outline
- 1. Introduction
- 2. Considerations in constructing a corpus
- 3. Original sign language corpora
- 3.1 Multilingual and bilingual corpora
- 3.2 Monolingual corpora
- 4. Construction of a signed language interpreting corpus
- 4.1 Material selection
- 4.2 Recording
- 4.3 Metadata
- 4.4 Transcription
- 4.4.1 Notation system
- 4.4.2 Gloss systems
- 4.5 Annotation
- 4.6 Corpus alignment
- 5. Conclusion
- Acknowledgements
- Notes
