This book describes new methodological and technological approaches to corpus building and presents recent research based on the Norwegian Newspaper Corpus. This is a large monitor corpus of contemporary Norwegian language, compiled through daily harvesting of web newspapers. The book gives an…
Anna-Brita Stenström, Gisle Andersen and Ingrid Kristine Hasund
Teenage talk is fascinating, though so far teenage language has not been given the attention in linguistic research that it merits. The dearth of investigations into teenage language is due in part to its underrepresentation in language corpora. With the Bergen Corpus of London Teenage Language (COLT)…
This book combines theoretical work in linguistic pragmatics and sociolinguistics with empirical work based on a corpus of London adolescent conversation. It makes a general contribution to the study of pragmatic markers, as it proposes an analytical model that involves notions such as…
In interactive discourse we not only express propositions, but we also express different attitudes to them. That is, we communicate how our mind entertains those propositions that we express. A speaker is able to express an attitude of belief, desire, hope, doubt, fear, regret or pretence that a…
This chapter investigates the interaction of participants in high-stakes meetings, more specifically, the ways in which the deliberations of the Federal Open Market Committee (FOMC) in the United States are characterized by interruptions and humour. We explore the recently compiled FOMC corpus…
The development of terminologies for domains where these are lacking is a time-consuming and costly task. This article takes a methodological perspective and addresses a general methodological question: how can we, with limited funding, utilise, to a maximal degree, existing language resources to…
An aspect of corpus compilation that poses a particular challenge is the question of how to transcribe orthographically units that are not part of any standardised vocabulary. Among the problematic categories we find voiced pauses, minimal response signals, interjections, certain discourse…
This paper focuses on English influence on Norwegian lexis and addresses the orthographic adaptation of import words, such as the change from blog to blogg, or squash to skvåsj. This adaptation can be viewed from a top-down perspective, by considering the effect of standardisation decisions made by…
This article describes corpus-based research methods and language processing tools that are used for the systematic study of the influence of English on Norwegian lexis. The tools are developed in connection with the Norwegian Newspaper Corpus (NNC) project. The study presents a survey of the types…
The Norwegian Newspaper Corpus (NNC) is an initiative to create a large monitor corpus representing contemporary Norwegian language in both its written varieties, Bokmål and Nynorsk. The corpus is compiled through daily harvesting and processing of published texts from the web edition of Norwegian…
Multiword expressions (MWEs) are words that co-occur so often that they are perceived as a linguistic unit. Since MWEs pervade natural language, their identification is pertinent for a range of tasks within lexicography, terminology and language technology. We apply various statistical association…