In: Exploring Language and Society with Big Data: Parliamentary discourse across time and space
Edited by Minna Korhonen, Haidee Kotze and Jukka Tyrkkö
[Studies in Corpus Linguistics 111] 2023
pp. 373–379
Index
Published online: 13 November 2023
https://doi.org/10.1075/scl.111.index
A
- accessibility
- of data,1–2
- of large datasets,8
- of parliamentary discourse,338
- of the Hansard Corpus,24
- accuracy of speeches in British Hansard,33
- adjectival modification patterns,167–168
- adjective-noun pairs, study of,124–126
- adjectives
- as important modifiers,182
- as indicators of sentiment,125–126, 128–130, 140
- analysis of Australian Hansard data,57
- analysis of data,264–265
- analysis, multidimensional,63
- analysis, statistical
- of British Hansard,97
- of transcripts vs. Australian Hansard,64
- analytical techniques,18
- apodosis and entonces or pues,325–326
- apodosis and then,315, 325–326
- assembly, lawful vs. unlawful,102–103
- attitude shift, indicators of,62
- audibility, effect on accuracy of Hansard,33
- Australian English, linguistic evolution of,255–256, 273
- Australian Hansard
- analysis of data from,57
- comparison with transcripts,60
- decontextualisation in,58
- reduction of interpersonal elements,58
- verbatim policy,258
- Australian parliament,8
- Australian parliamentary discourse, orthographic transcription of,60 See British Hansard 1889–1908
- automated data classification: pros and cons,74, 78–79
B
- background variables,5
- Barrow vs. Hansard rivalry,35–36
- big data
- perspective,12
- pros and cons,167–168
- techniques,302–303
- Brexit,6, 9, 143–163
- Britain vs. Europe,138
- British Empire,9, 122–123
- British English,10
- British Hansard
- accuracy of,26, 33, 200
- as a complete dataset,199–200
- as a verbatim transcript,25–26
- considerations pre- and post-1909,200
- contributors to,50
- correction of,33
- detail of vs. newspapers,38
- digitisation of,22, 90
- employment relationship in,175–189
- fidelity of,27, 91
- incompleteness of,34, 46
- linguistic outline 1803–present,49
- production of,259
- reported speech in the early,31
- size of by year and series,22
- sources of,36–37
- the series of,23
- two major phases of,22
- British Hansard 1803–1888
- linguistic characteristics,38–39
- overview,29–40
- British Hansard 1889–1908
- funding challenges,40
- linguistic characteristics,42
- overview,40–43
- British Hansard 1909–present
- full-length verbatim reporting,43
- in-house reporting,43
- overview,43–48
- British Hansard Corpus, history and language,7
- British Hansard in the 1800s,21
- British Hansard publishers,23–24, 42
- British Hansard's volumes, structure of,21
- British parliament,11
- British parliament as a debate parliament,309
- British vs. Australian Hansard: restrictive relative clause markers,265–268, 271
C
- close reading,59, 61–62, 64–72, 120–121, 188
- close reading vs. distant reading,8, 56–57, 85–86, 120–121
- Cobbett's Parliamentary Debates,20, 23
- codeswitching in parliaments,7
- collocation of
- lexical verbs,158, 161
- lunatic,218–221
- collocation, computational,98
- colloquialisation
- and democratisation,336–337, 366
- contractions in,351–353
- effect on relativiser choice,270
- in Australian Hansard,65
- in parliamentary discourse,310
- of English,11
- of parliamentary language,6
- shorter sentences as indicators of,351
- colonial language, parliamentary discourse as a barometer of,6, 127, 134–135
- colonial policy: effect on language,283–284
- colonies, language of,11
- communication within parliament,4, 262
- compression of English,11
- concordance
- corpora,156
- research,98, 101, 108
- searches,287
- concordance statistic (C-index),269
- conditional construction, position of if/si,326–328
- conditionals
- analysis of,315
- degree of likelihood of,323–325
- discourse-pragmatic functions,311–312
- frequency in parliamentary discourse,316–318
- metafunctions of,318–322
- parliamentary discourse and,313–314
- conservatism, stylistic See editorial decisions
- contemporary corpora, comparison with Hansard Corpus,119
- contextualisation of foreign immigrant,123–124
- contractions in colloquialisation,351–353
- corpora
- as indicators of changes in society,9
- design of,2–3
- to study social changes and events,9
- to track sociocultural changes and events,10
- corpus
- characteristics of a,2, 121
- compilation of a,2–3
- corpus linguistics
- in cultural studies,134–139
- keywords,98
- corpus-assisted discourse studies (CADS),144
- corpus-linguistic methods, value of,12
- corpus-pragmatic approach,228, 234–236
- correction of British Hansard, rules for,92
- critical discourse analysis (CDA),143–145
- cross-linguistic research,308–330
- cultural studies, corpus linguistics in,134–139
D
- data
- accessibility of,1–2, 90, 95–96
- different forms of,1–2
- historical,1–2
- interpretation thereof from distant reading analyses,75
- reliability of,12
- searchable,3, 90, 93, 95–96, 99, 102, 115
- use of,1–2, 91
- visualisation of,110–116
- data analysis
- inclusion vs. exclusion criteria,147
- validity of,246
- data collection
- automated (online),4, 279
- systematic,4
- data selection for corpus construction,285
- data, parliamentary
- different uses of,11
- nature of parameters studied,6
- dataset
- Hansard as a complete,199–200
- House of Commons,143, 163
- datasets, large
- accessibility of,8
- power and problems of,3
- usefulness of,4
- De Nationale Assemblée (DNA)
- composition of,284
- rules of conduct,285
- usefulness of data,284–285
- democracy indices, correlation with,345
- democratisation,9, 167–191
- and colloquialisation,336–337, 366
- as a political concept,170
- definition of,194
- effect of Industrial Revolution on,169
- effect on language,194
- inclusive language in,365
- linguistic definition,337–338
- political definition,337–338
- topic modelling with,359–364, 366–367
- demographic composition of parliaments,5
- demonstratives: use of in Australian Hansard,79, 82
- Diario de Sesiones del Congreso de los Diputados (Journal of Sittings of the Spanish Parliament),310–314, 316–318
- digital collections,3
- usefulness of,278
- digital corpora, production of,279
- Digital Humanities,3, 56, 121, 336
- digitisation of British Hansard,90, 100
- dimension score analysis,72–76
- direct speech (DS),28
- discourse markers
- function in parliamentary discourse,70
- in Australian Hansard,69–71
- discrimination, colour, of the 1970s,137–138
- discussion distribution, patterns of,102–103
- diglossia, 'leaky', as an indicator of language shift,281
- distant reading,63–64, 72–80, 120–121
- distant reading analyses, interpretation of,76, 242
- distant reading vs. close,8, 56–57, 85–86, 120–121
- distributional analysis of relative markers,265–268
- DNA corpus
- construction and analysis,285–288
- distribution of er in,293–294
- frequency of lexemes in,296–297
- Dutch, Netherlandic
- Dutch, Netherlandic vs. Surinamese,289–301
- Dutch, Surinamese,277, 279, 281–284
E
- editing of transcripts to produce Hansard,58
- editorial decisions
- consistency of,70
- effect on Australian Hansard,61, 65–69, 76–79, 82–83, 86
- effect on Hansard Corpus,27, 38, 91, 171–172
- editorial policy
- Australian Hansard,55
- British Hansard,34
- changes over time,10, 58, 68–69, 71–72
- effect on linguistic change,348, 366
- education and language,358
- Elan project,286–287
- emigration See migration
- employment relationship in the British Hansard,175–189
- er (there)
- distribution in DNA corpus,293–294
- Dutch particle,289–294
- in Belgian Dutch,290
- uses in Netherlandic Dutch,290–292
- uses in Sranantongo,292–293
- European Union
- British attitude towards,138
- inclusion of the UK,144
- study of pronouns to indicate attitude of British MPs towards,9, 149–153
- expert knowledge
- definition of,227, 233
- in policymaking,227–228
- to support political claims,228, 244
- expert, rhetorical function of terms,243–246
- expertise, types of,232–233
- experts
- characteristics of,229
- communication strategies of,230
- role in technical decision-making,230–231
- extensible markup language (XML) version of the Hansard Corpus,145, 201
F
- first-person content,31
- first-person pronouns
- as indicators of colloquialisation,344, 348
- as indicators of rhetorical strategy,67
- categories in the British Hansard,150
- frequency in the Hansard Corpus,32, 41–42
- first-person vs. third-person pronouns,66–69
- first-person vs. third-person reporting,42–43
- foreign immigrant, contextualisation of,123–124
- formalisation of Hansard content,58
- formality
- changes in,69, 78
- indicators of,66–69, 71–72, 82–83
- free direct speech (FDS),28, 30, 39, 48
- free indirect speech (FIS),28
- full-length verbatim reporting,43
H
- Hansard
- 2017 definition of,46
- as a written representation of spoken discourse,55, 90, 121
- purpose of,49, 90–91
- rivalry vs. Barrow,35–36
- use by other Commonwealth countries,258
- usefulness to linguists,26, 50, 91
- Hansard at Huddersfield
- function of,94
- interpretation of data from,101, 115
- keyword plots,113–114
- minimising misinterpretation,115
- searching the,99, 102
- tagging inaccuracies in,94
- to track historical events or figures,99–103
- to track topics,109–113
- to track word usage,104–109, 112
- Hansard competitors: Mirror/Barrow,35
- Hansard content, formalisation of,58
- Hansard corpora, versions of,25
- Hansard Corpus
- accessibility of,24, 90
- analysis of,18
- complexity of,18
- density of,18
- effect of editorial decisions on,27
- extensible markup language (XML) version,145, 201
- online,171, 201
- size of,18, 202
- understanding of,18
- Hansard language, features of,17
- Hansard Publishing Union,40–41
- Hansard's Parliamentary Debates,23
- Hansard's Parliamentary Debates vs. Official Report,31
- Hansard's Parliamentary Debates, accuracy of,36–37
- historical events
- effect on language,280
- indicators of,186–189
- Historical Thesaurus of English, use of,9–10, 225–226, 235
- House of Commons,9, 19, 21, 33, 105, 143, 163
- House of Commons vs. House of Lords: linguistic patterns,340, 354–356
- House of Lords,21, 33, 199
- House of Lords verbosity,356
I
- immigration See migration
- imperfect marking: Surinamese vs. Netherlandic Dutch,299–301
- inclusion vs. exclusion criteria in data analysis,147
- inclusive language in democratisation,365
- India, colonial attitude towards,134–136
- indicators of activity, verbs as,157, 160
- indicators of attitude shift,62
- indicators of sentiment towards master–servant relationship,173
- indicators of colloquialisation, first-person pronouns as,344, 348
- indicators of cultural change,235, 240
- indicators of democratisation, adjectival and prepositional modifications as,167–168
- indicators of formality,66–69, 71–72
- indicators of narrative vs. non-narrative reporting,73
- indicators of rhetorical strategy, first-person pronouns as,67
- indicators of sentiment of Britain towards the EU,159, 162
- indicators of sentiment towards mental illness,195, 197, 199, 213, 215, 220–222
- indicators of sentiment, adjectives as,125–126, 128–130, 140
- indirect speech (IS),28, 30
- Industrial Revolution
- effect on democratisation,169
- effect on language,339
- ingroup inclusion
- effect of 21st century EU developments,151–152
- indicators of,159–160
- in-house reporting in British Hansard 1909–present,43
- innovation, noun compounds as indicators of,349
- in-person reporting, importance of,47
- interface of spoken and written language,6–7
- interpretation in parliaments,7
- IQ and race,137
- Irish
- as a race,134
- discourse regarding mental health,218
K
- key terms
- democratisation,9, 172–173, 180, 186, 189–190
- employment,175, 179
- expert,236–241
- expert knowledge,241–243
- master, frequencies of referents of,182, 190
- mental health,195, 199, 201–205, 225–226
- mental health patients,206–211
- migration,124
- personal pronouns,9
- psychiatric illnesses,212–213
- psychiatric institutions,212–215
- race in the Hansard Corpus,119–120
- servant, frequency of,190
- which, that,10
- keywords in corpus linguistics,98, 113
L
- labour legislation, changes in,169–170
- Lancaster-Oslo-Bergen (LOB) Corpus,22
- language processing,263–264
- language transmission in Suriname,282
- legal discourse, parliamentary discourse as,313
- legislation, mental health: influence on and by discourse,198–199, 220–222
- legislation: master–servant relationship,169
- lemma search,128
- lexemes in the DNA corpus,296–297
- lexical shift,216
- lexical verbs
- collocation of,158, 161
- study of,156–157, 160
- lingua franca, unofficial, in Suriname,284
- linguistic changes and social changes, relationship between,340–341
- linguistic changes over time,347–353
- linguistic corpora,1–2
- linguistic features and external indices, correlation of,356–358
- linguistic norms, distant central,6
- linguistic outliers as indicators of historical events,186–189
- linguistic patterning,11
- linguistic patterns
- changes in parliamentary,340–341
- differences between House of Lords and House of Commons,340, 354–356
- subconscious,174
- linguistic trauma, definition of,202
- linguistic usefulness of Hansard,26
- linguistic variation,11
- linguistics, scope of,1
- local use (vs. distant central linguistic norms),6
- lunatic, case study of the term,217–220
M
- Margaret Thatcher,109
- markers of conditionality, prototypical: in English and Spanish,308–310
- master–servant relationship
- indicators of changing sentiment towards,173
- legislation governing,169
- mediation in parliaments,7
- megacorpus, characteristics of the ideal,4
- mental health discourse, highest frequency of,205
- mental health in Britain
- indicators of sentiment towards,195, 207
- overview,9–10
- pre-1800,196
- stigma of,195
- mental health, emergence of new terms,197
- mental health, major shifts in lexis thereof,220–222
- metafunctions of conditionals in Spanish,318
- migration,9, 122–126
- effect on language,280
- indicators of sentiment towards,139–140
- tracking of word usage,124
- use of Hansard to establish historical discourse,139–140
- Mirror of Parliament
- accuracy of,39
- competitor of British Hansard,34–35
- modal verbs,351
- modals
- as indicators of formality,71–72
- in Australian Hansard,71–72
- modification patterns, adjectival and prepositional,167–168
- Multidimensional Analysis Tagger (MAT),63–64
- multilingualism in parliaments,7
N
- narrative report of speech acts (NRSA),28, 30–31, 39, 42
- narrative style, trend towards,73
- narrative vs. non-narrative reporting, indicators of,73
- narrator vs. reporter, difference between,27
- noun compounds as indicators,131
- noun compounds as indicators of innovation,349
- nouns and noun compounds, frequency of,343–344, 348
- NRSA sources, limitations of,34
O
- Official Report
- accuracy of,45–46
- construction of,46–47
- editorial policy,44–45, 47–48
- inauguration of,21
- Official Report vs. Hansard's Parliamentary Debates,31
- Official Report vs. transcripts: discrepancies of pronouns,174
- Old Bailey Corpus,173
- one-third rule,41–42
- oral discourse vs. written,8
- orthographic transcription of Australian parliamentary discourse,60
- outgroup languages in Suriname,281
P
- parallel corpora,8
- parliamentary archives/records,1–2
- parliamentary corpora, comparison of spoken and written,303
- parliamentary data
- analysis of,8
- rich potential of,8
- Parliamentary Debates (Official Report),24
- Parliamentary Debates
- changes in content of,341–342
- production after Hansard business was sold,21
- parliamentary discourse,1
- accessibility of,338
- as a barometer of colonial vs. post-colonial language,6, 127–128
- as legal discourse,313
- changes in recording techniques over time,339
- colloquialisation in,310
- complete dataset of,199–200
- conditionals in,313–314
- effect of time on,5
- Parliamentary Discourse project,22
- parliamentary language,1
- colloquialisation of,6
- corpora of,279
- definition of,4
- driver or inhibitor of change,278
- history of,119
- parliamentary records, oral vs. official,8
- parliamentary reporting
- accuracy of,19–20, 200
- increase before 1803,20
- prohibition of in the UK,19
- parliamentary speeches
- interventions in,127–128
- nuances of reporting of,48
- parliaments, demographic composition of,5
- passive forms, frequency of,344
- personal pronoun use in ingroup vs. outgroup separation,162
- personal pronoun we,143–145
- personal pronouns,9
- Peterloo Massacre,99–103
- political party differences: attitude towards EU,153–155
- polysemous terms in mental health discourse,204
- possessive constructions in employment-related terms in the British Hansard,176, 178, 191
- POS-tagging,344
- post-colonial language, parliamentary discourse as a barometer of,6
- pragmatism in parliamentary discourse,11
- predictors of relative marker variation,253, 261–264
- predictors of relativiser choice,269–271
- prepositional modification patterns as indicators of democratisation,167–168
- prescriptivism
- effect on speech and writing,257–258
- in restrictive relative clause choice,252, 256–258, 270
- specific to Hansard,262
- prescriptivism and processing, interplay of,10
- pronouns as indicators of formality,66–69
- pronouns, discrepancies in Official Report vs. transcripts,174
- pronoun-verb compounds, study of,148
- protasis–apodosis position, preferential,330
- publishers of British Hansard,23–24
Q
- quotations within the Hansard,123
R
- race,9
- 18th and 19th century definition of,127
- approaches to in linguistics,121
- change in usage,127–128
- colonial vs. contemporary discourse,127, 130–131
- raciolinguistics, purpose of,121–122
- racism,133, 136
- recency of nouns,349–350
- referents
- first-person pronouns in British Hansard,150
- importance of,107
- register variation,72
- relative clauses, definition of restrictive vs. non-restrictive,254
- relative marker variation, predictors of,253, 261–264
- relative markers which, that,10
- relative markers which, that, distributional analysis of,265–268
- relativiser choice, predictors of,269–270
- relativisers
- formal,251
- informal,251
- wh-,255
- reliability of data,12
- reported speech in the early Hansard,31
- reporter vs. narrator, difference between,27
- research design
- corpus pragmatic approach,228, 234–236
- mixed methods,228, 234–236
- quantitative vs. qualitative,4, 120
- research methodology
- collocation,146
- concordance,146
- manual analysis/annotation,315
- mixed methods,195
- qualitative,93
- sampling,259–260
- variable clustering,215–217
- research potential: British Hansard,91–92
- restrictive clauses, markers of,10
- restrictive relative clause markers
- British vs. Australian Hansard,265–268, 271
- differences in American English,255
- prior research into,251–252
- rhetorical strategy, indicators of,67
S
- sampling techniques: randomised tokens,314, 342–343
- search strings, use of,122
- search techniques
- adjective-noun compound,183
- comparison of,125
- lemma search,128, 130
- lexical verbs,156
- search span,146
- selection criteria,260
- term selection,195
- searches, regex,300–301
- second-person pronouns in parliamentary discourse,68, 262
- semantic shift: lunatic,207–209
- sentence length as an indicator of colloquialisation,351
- shorthand records neglected by newspapers,36
- social change,11
- social changes and linguistic changes, relationship between,340–341
- sociocultural changes and events
- effect on language,10
- indicators of,9
- sociocultural factors: impact on language,9
- sociolinguistic variables,5
- Spanish grammar, conditionals in,312 See Diario de Sesiones del Congreso de los Diputados
- Spanish parliament,11
- Spanish parliament as a working parliament,309
- speech report interference,27
- speech representation, factors that influence,7
- speech to text: automated,286
- advantages of,288
- inaccuracy of,287
- speeches, uncorrected, in British Hansard,33
- speech-to-text software,11
- spoken discourse, reduction of features thereof in Australian Hansard,80–82
- spoken English vs. written,252
- spoken language vs. written,6–7
- spoken language, variations in,11
- Sranantongo (Sranan),277, 280–282, 284
- Stanford Tagger,260
- statistical analysis
- of frequencies,205, 252
- of quantitative results,205–217, 252
- sub-corpora, complementary,60–61
- sub-corpora, complementary: features for investigation,61
- supplementary contemporary material, British Hansard,50
- Suriname
- language transmission in,282
- languages of,278
- linguistic history of,280–283
- parliamentary language of,302
T
- tagging of corpus text,24
- term frequency by genre,350–351
- term selection,236–238, 241
- European Community,146
- mental health,201–204
- search strings,122
- searching Hansard at Huddersfield,107
- searching the Hansard Corpus,172–173
- terminology shifts as a result of World War II,10
- that as a restrictive relative clause marker, preference for,254
- that as an informal relativiser,251
- that-rule,252, 256–257, 268
- third-person vs. first-person pronouns,66–69
- Thomas Curson (T.C.) Hansard Jnr,20
- Thomas Curson (T.C.) Hansard Jnr, retirement of,21, 40
- Thomas Curson (T.C.) Hansard Snr,20
- Thomas Curson (T.C.) Hansard Snr, death of,21
- time (of year and during election cycle), effect on parliamentary discourse,5
- timeframe as a predictor of relative marker alternation,271
- topic modelling,345–346, 359–364, 366–367
- topics
- frequency and significance of,39
- peaks of discussion,123
- tracking
- of democratisation,167–168
- of historical discourse,139–140
- of historical events or figures,99–103, 186–189
- of topics,109–113
- of word usage,104–109, 112, 119–120, 124
- transcription as the interface between spoken and written language,6–7
- transcription, orthographic: of Australian parliamentary discourse,60
- transcripts
- clarity in,26
- different types of,26
- fidelity of,28
- transcripts vs. Australian Hansard
- changes in frequency of selected markers,65
- convergence over time,86–87
- reduction of spoken discourse features,80–82
- use of demonstratives,79, 82
- transcripts vs. Official Report: discrepancies of pronouns,174
- translation in parliaments,7
- turnaround time of British Hansard,90
V
- variable clustering,215–217
- variation, register,72
- verbatim
- definition of,26, 43
- elements most likely to be reported,93
- verbatim policy, Australian Hansard,258
- verbs as indicators
- of activity,157, 160
- of sentiment,159
- video recordings of Surinamese parliament,278
W
- which as a formal relativiser,251
- which as a restrictive relative clause marker, preference for,252
- word clouds,110–112
- word order: Surinamese vs. Netherlandic Dutch,295–299
- World War II, effects on language,10
- written discourse vs. oral,8
- written English vs. spoken,252
- written language vs. spoken,6–7
