In: Multimodal Communication from a Construction Grammar Perspective
Edited by Kiki Nikiforidou and Mirjam Fried
[Constructional Approaches to Language 38] 2025
pp. 154–189
A multimodal approach to coordination in spontaneous conversation
Published online: 16 January 2025
https://doi.org/10.1075/cal.38.05lel
Abstract
This chapter proposes a constructional framework that includes the verbal, vocal, and gestural
modalities to describe coordination in conversation. I suggest a definition for coordination that is not
modality-specific, and provide a detailed analysis of two coordinate structures from a corpus of spontaneous speech in
British English that illustrates this definition. To assess its implications, a series of exploratory analyses
investigating the relationship between discourse sequence type and coordination was carried out. This study is a first
step toward a new model for coordination that contributes to the development of a cognitive-linguistic approach to
multimodal and interactional features of language use.
Keywords: coordination, spontaneous speech, prosody, gesture, Construction Grammar
Article outline
- 1. Introduction
- 2. Theoretical background
- 2.1 Cognitive and Construction grammars
- 2.2 Coordination in other linguistic subfields
- 2.3 Coordination between discourse units
- 2.4 Coordination between prosodic units
- 2.5 Coordination between gesture units
- 3. Methodology
- 3.1 Research questions
- 3.2 Corpus transcription and annotation
- 3.3 Working definition of coordination and annotation
- 4. Coordination: Detailed analysis of two examples
- 4.1 Coordination in a description sequence
- 4.2 Coordination in a question-answer sequence
- 4.3 Summary
- 5. Corpus overview
- 6. Exploratory analyses I and II
- 6.1 Analysis I. Discourse sequence type
- 6.1.2 Discussion of analysis I
- 6.2 Analysis II. Coordination modality
- 6.2.1 Discussion of analysis II
- 7. General discussion and conclusion
- 7.1 Methodological developments
- 7.2 Perspectives
- Supplementary material
Notes Bibliography Appendix
