In: Discourse Markers and (Dis)fluency: Forms and functions across languages and registers
Ludivine Crible
[Pragmatics & Beyond New Series 286] 2018
pp. 55–80
Chapter 4. Corpus and method
Published online: 1 March 2018
https://doi.org/10.1075/pbns.286.c4
Article outline
- 4.1 The DisFrEn dataset
- 4.1.1 Source corpora
- 4.1.2 Comparable corpus design
- 4.1.3 Corpus structure in situational features
- 4.2 Discourse marker annotation
- 4.2.1 Identification of DM tokens
- 4.2.2 Functional taxonomy
- 4.2.3 Three-fold positioning system
- 4.2.4 Other variables
- 4.2.5 Annotation procedure
- 4.2.5.1 Software
- 4.2.5.2 Disambiguation method
- 4.3 Disfluency annotation
- 4.3.1 Simple fluencemes
- 4.3.1.1 Silent pauses
- 4.3.1.2 Filled pauses
- 4.3.1.3 Explicit editing terms
- 4.3.1.4 False-starts
- 4.3.1.5 Truncations
- 4.3.2 Compound fluencemes
- 4.3.2.1 Identical repetitions
- 4.3.2.2 Modified repetitions
- 4.3.2.3 Morphosyntactic substitutions
- 4.3.2.4 Propositional substitutions
- 4.3.3 Related phenomena and diacritics
- 4.3.4 Annotation procedure
- 4.3.4.1 Technical format
- 4.3.4.2 Scope of the disfluency annotation
- 4.3.4.3 Replicability of the disfluency annotation
- 4.3.5 Macro-labels of sequences
- 4.4 Summary
Notes
