Notation Systems in Spoken Language Corpora
Descriptions of the grammar of languages have traditionally been based on language use in the written mode. Spoken language was frequently considered inferior to written language and unworthy of description as a model for language use; moreover, because of the fleeting nature of sound, it was not possible to collect enough data on which to base an adequate description. Technical inventions of the twentieth century have solved the problem of collecting spoken data, and changed beliefs about the quality and value of spoken language have altered the approach towards this most common kind of language use. The description of spoken language characteristics and sequential patterns is nowadays a focal aspect of some areas of linguistic research. To this end, the preservation of conversational data is vitally important, and as the possibilities of technical storage continually improve, researchers are being provided with overwhelming amounts of data. For the purpose of analysis, however, recorded data needs to be put into an analyzable format, and transcription of spoken data into some kind of written form is still the norm if linguistic analysis is to take place. Certain characteristics of spoken language set it apart from written language (Biber 1988) and impose particular demands on the process of transcription. A detailed and informative transcription of spoken language requires the inclusion of many features that are a vital part of spoken interaction but do not figure prominently, if at all, in written language, such as intonation contours, (mis)pronunciations, overlaps or truncations of utterances, emphasis, and many more.